This paper studies the heat flow on Finsler manifolds. A Finsler manifold is a smooth manifold
$M$ equipped with a Minkowski norm $F(x,\cdot) : T_xM \to \mathbb{R}_+$ on each tangent space. Mostly, we
will require that this norm is strongly convex and smooth and that it depends smoothly on the
base point $x$. The particular case of a Hilbert norm on each tangent space leads to the important
subclass of Riemannian manifolds, where the heat flow is widely studied and well understood.
We present two approaches to the heat flow on a Finsler manifold:

• either as gradient flow on $L^2(M,m)$ for the energy

$$\mathcal{E}(u) = \frac{1}{2}\int_M F^2(\nabla u)\,dm;$$

• or as gradient flow on the reverse $L^2$-Wasserstein space $\mathcal{P}_2(M)$ of probability measures on
$M$ for the relative entropy

$$\mathrm{Ent}(u) = \int_M u\log u\,dm.$$

Both approaches depend on the choice of a measure $m$ on $M$ and then lead to the same nonlinear
evolution semigroup. We prove $C^{1,\alpha}$-regularity for solutions to the (nonlinear) heat equation on
the Finsler space $(M,F,m)$. Typically, solutions to the heat equation will not be $C^2$. Moreover,
we derive pointwise comparison results à la Cheeger-Yau and integrated upper Gaussian
estimates à la Davies.

1 Finsler Manifolds 
1.1 Finsler Structures 

Throughout this paper, a Finsler manifold will be a pair $(M, F)$ where $M$ is a smooth, connected
$n$-dimensional manifold and $F : TM \to \mathbb{R}_+$ is a measurable function (called Finsler structure)
with the following properties:

(i) $F(x, c\xi) = cF(x, \xi)$ for all $(x, \xi) \in TM$ and all $c > 0$.

(ii) For each point $x \in M$, there are a local coordinate system $(x^i)_{i=1}^n$ on a neighborhood $U$ of
$x$ and positive numbers $\lambda$ and $\lambda^*$ such that, for almost every $x \in U$, the function $F^2(x,\cdot)$
on $T_xM \setminus \{0\}$ is twice differentiable and the $(n \times n)$-matrix

$$g_{ij}(x,\xi) := \frac{\partial^2}{\partial\xi^i\,\partial\xi^j}\Big(\frac{1}{2}F^2\Big)(x,\xi) \qquad (1.1)$$

is uniformly elliptic on $U$ in the sense that

$$\lambda^*\sum_{i=1}^n (\eta^i)^2 \le \sum_{i,j=1}^n g_{ij}(x,\xi)\,\eta^i\eta^j \le \frac{1}{\lambda}\sum_{i=1}^n (\eta^i)^2 \qquad (1.2)$$

holds for all $\xi \in T_xM \setminus \{0\}$ and all $\eta \in T_xM$. Here $(x^i, \xi^i)_{i=1}^n$ denotes the local coordinate
system on $\pi^{-1}(U) \subset TM$ given by $\xi = \sum_{i=1}^n \xi^i(\partial/\partial x^i)$. We will say that such a point $x$ is
regular.

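The uniform ellipticity (1.2) in particular implies

$$\lambda^*\sum_{i=1}^n (\xi^i)^2 \le F^2(x,\xi) \le \frac{1}{\lambda}\sum_{i=1}^n (\xi^i)^2 \qquad (1.3)$$

and thus the existence of positive constants $\kappa$ and $\kappa^*$ with

$$\kappa^* F^2(x,\eta) \le \sum_{i,j=1}^n g_{ij}(x,\xi)\,\eta^i\eta^j \le \frac{1}{\kappa}F^2(x,\eta) \qquad (1.4)$$

for almost all $x \in U$ and all $\xi, \eta \in T_xM \setminus \{0\}$. This (coordinate-free) inequality in turn implies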



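$$F^2\Big(x, \frac{\xi+\eta}{2}\Big) \le \frac{1}{2}F^2(x,\xi) + \frac{1}{2}F^2(x,\eta) - \frac{\kappa^*}{4}F^2(x,\eta-\xi) \qquad (1.5)$$

for all $x \in U$ and $\xi, \eta \in T_xM$ (see [BCL], [Oh3]). For any subset $\Omega \subset M$, the largest constants
$\kappa, \kappa^* \in (0, 1]$ such that (1.4) holds for all $x \in \Omega$ will be denoted by $\kappa_\Omega$ and $\kappa^*_\Omega$. The constants
$1/\sqrt{\kappa^*_\Omega}$ and $1/\sqrt{\kappa_\Omega}$ are also known as 2-uniform convexity and smoothness constants. Let us
remark that $\kappa_\Omega = 1$ (or $\kappa^*_\Omega = 1$) if and only if $F(x,\cdot)$ is a Hilbert norm for each $x \in \Omega$.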

A nonnegative function $\|\cdot\|$ on $\mathbb{R}^n$ is called a Minkowski norm (and the pair $(\mathbb{R}^n, \|\cdot\|)$ is
then called a Minkowski space) if $\|x\| > 0$, $\|cx\| = c\|x\|$ and $\|x + y\| \le \|x\| + \|y\|$ hold for all
$x, y \in \mathbb{R}^n \setminus \{0\}$ and $c > 0$. Thus a Finsler structure $F$ on $M$ induces for a.e. $x \in M$ a Minkowski
norm $F(x,\cdot)$ on the tangent space $T_xM$.

Observe that there is a one-to-one correspondence between Minkowski norms $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}_+$
and convex, bounded open sets $B \subset \mathbb{R}^n$ containing the origin: given $\|\cdot\|$, $B$ will be the
open unit ball $\{x \in \mathbb{R}^n : \|x\| < 1\}$; given $B$, the associated Minkowski norm is defined by
$\|x\| = \inf\{c > 0 : c^{-1}x \in B\}$. Obviously, $\|\cdot\|$ will even be a norm if and only if $B$ is symmetric
(i.e., $x \in B$ if and only if $-x \in B$).
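
The correspondence between unit balls and Minkowski norms can be checked numerically. The following is a minimal sketch and not part of the paper's argument: the helper name `gauge`, the bisection bounds and the sample ball (an off-center Euclidean disk in $\mathbb{R}^2$) are illustrative assumptions.

```python
import numpy as np

def gauge(x, in_ball, hi=1e6, tol=1e-12):
    """Approximate inf{c > 0 : x / c in B} by bisection.

    `in_ball` is a membership test for a convex, bounded open set B
    containing the origin; `hi` is assumed large enough that x / hi lies in B.
    """
    x = np.asarray(x, dtype=float)
    if not np.any(x):
        return 0.0
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if in_ball(x / mid):
            hi = mid  # x / mid is inside B, so the infimum is at most mid
        else:
            lo = mid  # x / mid is outside B, so the infimum is at least mid
    return hi

# Hypothetical example: the unit disk shifted by (0.5, 0). It contains the
# origin but is not symmetric, so its gauge is a Minkowski norm but not a norm.
center = np.array([0.5, 0.0])
in_disk = lambda z: np.linalg.norm(z - center) < 1.0

forward = gauge([1.0, 0.0], in_disk)    # the ray hits the boundary at (1.5, 0)
backward = gauge([-1.0, 0.0], in_disk)  # the ray hits the boundary at (-0.5, 0)
print(forward, backward)
```

In this example the two values differ, which illustrates that a non-symmetric ball $B$ yields a Minkowski norm with $\|x\| \ne \|-x\|$.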

The reverse Finsler structure $\overleftarrow{F}$ of $F$ is defined by $\overleftarrow{F}(x,\xi) := F(x,-\xi)$. We say that $F$ is
reversible (or absolutely homogeneous) if $\overleftarrow{F} = F$.

Usually in differential geometry, Finsler manifolds are assumed to be smooth in the sense that $F$
is smooth on $TM \setminus \{0\}$. Note that we never require smoothness at the zero section. Requiring
that $F^2(x,\cdot)$ is $C^2$ on all of $T_xM$ implies that $F(x,\cdot)$ is a Hilbert norm (see [Sh1, Proposition
2.2]).

1.2 The Legendre Transform 

For a Finsler structure $F$ on $M$, we define the dual structure $F^* : T^*M \to \mathbb{R}_+$ by

$$F^*(x,\alpha) = \sup\{\alpha\xi : \xi \in T_xM,\ F(x,\xi) \le 1\}$$

for $(x, \alpha) \in T^*M$ with regular $x$. For each regular $x \in M$, we remark that $F^*(x,\cdot)$ is a Minkowski
norm on $T_x^*M$ and set

$$g^*_{ij}(x,\alpha) := \frac{\partial^2}{\partial\alpha_i\,\partial\alpha_j}\Big(\frac{1}{2}F^{*2}\Big)(x,\alpha) \qquad (1.6)$$

for $\alpha = \sum_{i=1}^n \alpha_i\,dx^i \in T_x^*M \setminus \{0\}$ in a local coordinate system $(x^i)_{i=1}^n$. (Here $F^{*2}(x,\cdot)$ is indeed
twice differentiable on $T_x^*M \setminus \{0\}$, see Lemma 1.1(iii) below.)

The Legendre transform or transfer map $J^* : T^*M \to TM$ assigns to each $\alpha \in T_x^*M$ with regular
$x$ the unique maximizer of the function

$$\xi \mapsto \alpha\xi - \frac{1}{2}F^2(x,\xi) - \frac{1}{2}F^{*2}(x,\alpha) \qquad (1.7)$$

on $T_xM$. (The last term is unnecessary but inserted for the sake of symmetry.) The uniqueness
is guaranteed by the strict convexity of $F(x,\cdot)$. The vector $J^*(x,\alpha)$ can be characterized as
the unique vector $\xi \in T_xM$ with $F(x,\xi) = F^*(x,\alpha)$ and $\alpha\xi = F^*(x,\alpha)F(x,\xi)$. We can define
$J : TM \to T^*M$ in an analogous way and then $J(x,\xi)$ is the unique maximizer of (1.7) as a
function of $\alpha$.

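We recall several standard properties of the Legendre transform which can be found in [BCS,
§14.8] for instance.

Lemma 1.1 Fix regular $x \in M$.

(i) It holds that $J^* = J^{-1}$ on $T_x^*M$.

(ii) For any $\alpha = \sum_{i=1}^n \alpha_i\,dx^i \in T_x^*M$ and $\xi = \sum_{i=1}^n \xi^i(\partial/\partial x^i) \in T_xM$, we have

$$J(x,\xi) = \sum_{i=1}^n \frac{\partial}{\partial\xi^i}\Big(\frac{1}{2}F^2\Big)(x,\xi)\,dx^i, \qquad J^*(x,\alpha) = \sum_{i=1}^n \frac{\partial}{\partial\alpha_i}\Big(\frac{1}{2}F^{*2}\Big)(x,\alpha)\,\frac{\partial}{\partial x^i}.$$

(iii) $g^*_{ij}(x,\alpha)$ in (1.6) is well-defined for all $\alpha \in T_x^*M \setminus \{0\}$ and we have, for all $\xi \in T_xM \setminus \{0\}$
and $\alpha \in T_x^*M \setminus \{0\}$,

$$J(x,\xi) = g(x,\xi)\xi = \sum_{i,j=1}^n g_{ij}(x,\xi)\,\xi^i\,dx^j \in T_x^*M,$$

$$J^*(x,\alpha) = g^*(x,\alpha)\alpha = \sum_{i,j=1}^n g^*_{ij}(x,\alpha)\,\alpha_i\,\frac{\partial}{\partial x^j} \in T_xM.$$

In particular, $g^*_{ij}(x,\alpha)$ is the inverse matrix of $g_{ij}(x, J^*(x,\alpha))$.

(iv) For all $\xi, \eta \in T_xM$, we have

$$\big(J(x,\eta) - J(x,\xi)\big)(\eta - \xi) \ge \kappa^* F^2(x, \eta - \xi). \qquad (1.8)$$

(v) The dual structure $F^*$ satisfies estimates analogous to (1.2), (1.3), (1.4), (1.5) and (1.8)
with $\lambda^*$ and $\kappa^*$ in the place of $\lambda$ and $\kappa$, respectively, and vice versa.

Proof: The existence of $g^*_{ij}(x,\alpha)$ in (iii) is merely a consequence of the inverse function
theorem (for $J$ and $J^*$). As for (iv), since $g$ is the derivative of $J$ by (ii), the mean value theorem implies that
$(J(x,\eta) - J(x,\xi))(\eta - \xi) = (\eta - \xi)\,g(x,\gamma)\,(\eta - \xi)$ for some $\gamma$ on the segment between $\xi$ and $\eta$.
Using (1.4) the RHS can thus be estimated from below by $\kappa^* F^2(x,\eta - \xi)$. ■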

Note that at the origin $J^*(x,\cdot)$ is continuous but not differentiable (even if $F$ is smooth on
$TM \setminus \{0\}$).

Remark 1.2 Fixing a coordinate system, we may identify both $T_xM$ and $T_x^*M$ with the
Euclidean space $\mathbb{R}^n$. Given a vector $\xi \in T_xM$ of length $F(x,\xi) = 1$, the vector $J^*(x,\xi)$ corresponds
to the unit normal vector to the unit sphere in $T_xM$ at the point $\xi$ (Figure 1.1).
For each regular $x \in M$ and $\xi \in T_xM \setminus \{0\}$, the map

$$\eta \mapsto \Big(\sum_{i,j=1}^n g_{ij}(x,\xi)\,\eta^i\eta^j\Big)^{1/2} \qquad (1.9)$$

defines a Hilbert norm on $T_xM$. It can be regarded as the best Hilbert norm approximation of
the norm $F(x,\cdot)$ in directions close to $\xi$. More precisely, if $\xi \in \partial B$ is a unit tangent vector, then
the unit sphere $\partial B_\xi$ associated with the norm (1.9) is the centered ellipse in $\mathbb{R}^n$ approximating
$\partial B$ up to second order at the point $\xi$ (Figure 1.2).

Example 1.3 (i) Riemannian spaces: Let $M = \mathbb{R}^n$ and $F$ be given by

$$F^2(x,\xi) = \sum_{i,j=1}^n a_{ij}(x)\,\xi^i\xi^j$$

with a symmetric, positive-definite matrix $a(x) = (a_{ij}(x))$ on $\mathbb{R}^n$. Then $g(x,\xi) = a(x)$ independently
of $\xi$ and $J(x,\xi) = a(x)\xi$. Moreover, $g^*(x,\alpha) = a(x)^{-1}$ and $J^*(x,\alpha) = a(x)^{-1}\alpha$.



(ii) $\ell^p$-spaces: Let $M = \mathbb{R}^n$ and $F(x,\xi) = \|\xi\|_p = (\sum_{i=1}^n |\xi^i|^p)^{1/p}$ for some $1 < p < \infty$. Then

$$J(x,\xi) = \|\xi\|_p^{2-p}\,\big(|\xi^i|^{p-2}\xi^i\big)_{i=1}^n$$

and $F^*(x,\alpha) = \|\alpha\|_{p^*}$ for the dual exponent $p^*$ satisfying $1/p + 1/p^* = 1$. However, $\|\cdot\|_p$ is
only 2-uniformly convex if $1 < p \le 2$ ($\kappa^* = p - 1$, $\kappa = 0$ in (1.5)) and only 2-uniformly smooth
if $2 \le p < \infty$ ($\kappa^* = 0$, $\kappa = 1/(p - 1)$ in (1.5)). Therefore $\|\cdot\|_p$ is uniformly elliptic only when
$p = 2$. Nevertheless, we can still consider the Laplacian (see the next chapter).
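
The identities characterizing the Legendre transform can be tested numerically for the $\ell^p$ norm. The sketch below is only an illustration: the function names, the exponent $p = 3$ and the sample vector are arbitrary choices made here, and the closed formula used for the transform is the gradient of $\|\cdot\|_p^2/2$.

```python
import numpy as np

def lp_norm(v, p):
    return float(np.sum(np.abs(v) ** p) ** (1.0 / p))

def legendre_lp(v, p):
    """Gradient of v -> ||v||_p^2 / 2, i.e. ||v||_p^(2-p) * (|v_i|^(p-2) v_i)_i.

    Assumes v is not the zero vector.
    """
    return lp_norm(v, p) ** (2.0 - p) * np.sign(v) * np.abs(v) ** (p - 1.0)

p = 3.0
p_dual = p / (p - 1.0)           # 1/p + 1/p_dual = 1
xi = np.array([1.0, -2.0, 0.5])  # arbitrary nonzero sample vector

alpha = legendre_lp(xi, p)            # J(x, xi), a covector
xi_back = legendre_lp(alpha, p_dual)  # J*(x, alpha), expected to recover xi

print(lp_norm(alpha, p_dual), lp_norm(xi, p))  # F*(J(xi)) versus F(xi)
print(alpha @ xi, lp_norm(xi, p) ** 2)         # pairing versus F(xi)^2
print(xi_back)
```

The three printed comparisons correspond to $F^*(x, J(x,\xi)) = F(x,\xi)$, to $J(x,\xi)\xi = F^2(x,\xi)$, and to $J^* = J^{-1}$.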

(iii) Deformation of Minkowski spaces: Let $M = \mathbb{R}^n$ and $F(x,\xi) = \|\sigma(x)\xi\|$ for some invertible
matrix $\sigma(x)$ and some Minkowski norm $\|\cdot\|$ on $\mathbb{R}^n$ which is strictly convex and twice differentiable
on $\mathbb{R}^n \setminus \{0\}$. (Case (i) is the particular case with Euclidean norm and $a(x) = \sigma(x)^T\sigma(x)$.) Then
we have $J(x,\xi) = \sigma(x)^T J_0(\sigma(x)\xi)$, $g(x,\xi) = \sigma(x)^T g_0(\sigma(x)\xi)\sigma(x)$ and $F^*(x,\alpha) = F_0^*(\alpha\sigma^{-1}(x))$,
where $J_0$, $g_0$ and $F_0^*$ are taken with respect to the original norm $\|\cdot\|$.

(iv) Hilbert geometry: Let $D \subset \mathbb{R}^n$ be a bounded open convex domain with smooth boundary
$\partial D$ such that $D \cup \partial D$ is strictly convex. Given distinct $x, y \in D$, let $x' \in \partial D$ be the intersection
of the half line $x + \mathbb{R}_+(x - y)$ with $\partial D$. Similarly, let us denote by $y' \in \partial D$ the intersection of
$y + \mathbb{R}_+(y - x)$ with $\partial D$. Then the Hilbert metric $d_H$ is defined by

$$d_H(x,y) = \log\Big(\frac{|x' - y| \cdot |x - y'|}{|x' - x| \cdot |y - y'|}\Big),$$

where $|\cdot|$ is the standard Euclidean norm. If $D$ is the unit ball, then $(D, d_H)$ coincides with
the Klein model of the hyperbolic space. In general, $d_H$ arises from a Finsler metric of constant
negative flag curvature (see [Eg]).

(v) Teichmüller metric: The Teichmüller metric on Teichmüller space is arguably the most
famous Finsler metric in differential geometry. It is known to be complete, while the
Weil-Petersson metric is Riemannian and incomplete (see [EE], [Wo]).

1.3 Regularization 

Several of the results presented in this paper remain true for more general Finsler
structures that satisfy our basic regularity assumption (1.2) not with positive constants $\lambda$ and $\lambda^*$ but
just with nonnegative constants. However, each Finsler structure $F$ of this type can easily be
approximated by Finsler structures $F_{[\varepsilon]}$ satisfying our assumptions. We will illustrate this in the
particular case of Minkowski norms on $\mathbb{R}^n$.

For the sequel, we fix a Minkowski norm $\|\cdot\|$ and we denote by $g(\xi)$ the Hessian of $\|\cdot\|^2/2$ at
the point $\xi$. We say that the Minkowski norm $\|\cdot\|$ is regular if it satisfies (1.2) with positive
constants $\lambda$ and $\lambda^*$.



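Moreover, we denote the Euclidean norm on $\mathbb{R}^n$ by $|\cdot|$. Note that the Hessian of $|\cdot|^2/2$ at each
point is the identity matrix $\mathbf{1}$. We define the $\varepsilon$-lower regularization of the Minkowski norm $\|\cdot\|$
by $\|\xi\|_{\lfloor\varepsilon\rfloor} = \sqrt{\|\xi\|^2 + \varepsilon|\xi|^2}$ and the $\varepsilon$-upper regularization by $\|\cdot\|_{\lceil\varepsilon\rceil} = \big(\sqrt{\|\cdot\|^{*2} + \varepsilon|\cdot|^2}\big)^*$. Here
$\|\cdot\|^*$ denotes the dual norm. Obviously, on the level of the Hessians this means $g_{\lfloor\varepsilon\rfloor}(\xi) = g(\xi) + \varepsilon\mathbf{1}$
and $g^*_{\lceil\varepsilon\rceil}(\alpha) = g^*(\alpha) + \varepsilon\mathbf{1}$ where of course $g^*(\alpha)$ is the Hessian of $\|\cdot\|^{*2}/2$ at the point $\alpha$. Recall
also that $g^*(\alpha)$ is the inverse of $g(J^*(\alpha))$. Moreover, we define the $\varepsilon$-regularization $g_{[\varepsilon]}(\xi)$ of the
matrix $g(\xi)$ by

$$g_{[\varepsilon]}(\xi) = \big(g(\xi) + \varepsilon\mathbf{1}\big) \circ \big(\mathbf{1} + \varepsilon g(\xi)\big)^{-1}.$$

If we define in a similar way $g^*_{[\varepsilon]}(\alpha)$ then it will be inverse to $g_{[\varepsilon]}(J^*(\alpha))$. Obviously, for each
$\varepsilon > 0$

$$g_{[\varepsilon]}(\xi) \ge \varepsilon\mathbf{1}, \qquad g^*_{[\varepsilon]}(\alpha) \ge \varepsilon\mathbf{1}$$

(in the sense of quadratic forms) for all $\xi$ and $\alpha$. Finally, let us put

$$\|\xi\|_{[\varepsilon]} = \sqrt{\xi \cdot g_{[\varepsilon]}(\xi) \cdot \xi}, \qquad \|\alpha\|^*_{[\varepsilon]} = \sqrt{\alpha \cdot g^*_{[\varepsilon]}(\alpha) \cdot \alpha}.$$

Then $\|\cdot\|_{[\varepsilon]}$ and $\|\cdot\|^*_{[\varepsilon]}$ are regular Minkowski norms, dual to each other. As $\varepsilon$ goes to zero, they
approximate the original norm $\|\cdot\|$ and its dual, respectively.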
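
The matrix identities behind the regularization are elementary linear algebra and can be checked at a single point. The following sketch is an illustration only: the function name `eps_regularize`, the sample matrix and the value of $\varepsilon$ are assumptions made here, and the check treats one fixed positive-definite matrix (playing the role of $g(J^*(\alpha))$, with inverse $g^*(\alpha)$) rather than a full Minkowski norm.

```python
import numpy as np

def eps_regularize(g, eps):
    """(g + eps * 1) (1 + eps * g)^{-1} for a symmetric matrix g."""
    identity = np.eye(g.shape[0])
    return (g + eps * identity) @ np.linalg.inv(identity + eps * g)

# Hypothetical sample: positive definite, but with a very small eigenvalue,
# so the ellipticity constant of g itself is poor.
g = np.diag([1e-6, 1.0, 50.0])
eps = 0.1

g_eps = eps_regularize(g, eps)
g_star_eps = eps_regularize(np.linalg.inv(g), eps)  # same recipe applied to g^{-1}

eigs = np.linalg.eigvalsh(g_eps)
print(eigs)                                         # all within [eps, 1/eps]
print(np.allclose(g_eps @ g_star_eps, np.eye(3)))   # the two are mutually inverse
```

For $0 < \varepsilon \le 1$ each eigenvalue $\mu$ of $g$ is mapped to $(\mu + \varepsilon)/(1 + \varepsilon\mu) \in [\varepsilon, 1/\varepsilon]$, which is the two-sided bound observed in the output.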

1.4 Gradient Vectors and Distance 

For a weakly differentiable function $u : M \to \mathbb{R}$, define its gradient vector by

$$\nabla u(x) := J^*(x, Du(x)) \qquad (1.10)$$

for every regular $x \in M$, where the derivative $Du(x) \in T_x^*M$ is well-defined. In a local coordinate
system, we have $Du(x) = \sum_{i=1}^n (\partial u/\partial x^i)(x)\,dx^i$ and

$$\nabla u(x) = \sum_{i,j=1}^n g^*_{ij}(x, Du(x))\,\frac{\partial u}{\partial x^i}(x)\,\frac{\partial}{\partial x^j}.$$

We remark that the nonlinearity descends from the Legendre transform to the gradient vector,
namely $\nabla(u + v) \ne \nabla u + \nabla v$ in general. For the same reason, at points $x$ with $\nabla u(x) = 0$ the
gradient vector field $\nabla u$ is in general not differentiable (even if $(M, F)$ and $u$ are smooth) but
only continuous.
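
The failure of additivity is already visible for linear functions on a Minkowski space. The sketch below is a hypothetical example chosen for illustration: it takes $F = \|\cdot\|_4$ on $\mathbb{R}^2$, whose dual norm is $\|\cdot\|_{4/3}$, and uses the closed formula for the gradient of $\|\cdot\|_{4/3}^2/2$ as the Legendre transform $J^*$; the function name is an assumption.

```python
import numpy as np

Q = 4.0 / 3.0  # dual exponent of p = 4

def grad_of_linear(alpha):
    """Gradient vector J*(alpha) of u(x) = alpha . x for F = ||.||_4 on R^2."""
    alpha = np.asarray(alpha, dtype=float)
    dual_norm = np.sum(np.abs(alpha) ** Q) ** (1.0 / Q)
    return dual_norm ** (2.0 - Q) * np.sign(alpha) * np.abs(alpha) ** (Q - 1.0)

du = np.array([1.0, 0.0])  # derivative of u(x) = x^1
dv = np.array([0.0, 1.0])  # derivative of v(x) = x^2

grad_sum = grad_of_linear(du + dv)                   # gradient of u + v
sum_grads = grad_of_linear(du) + grad_of_linear(dv)  # sum of the gradients
print(grad_sum, sum_grads)
```

Here the gradient of $u + v$ is $(\sqrt{2}, \sqrt{2})$ while the sum of the gradients is $(1, 1)$.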

We define the distance function $d : M \times M \to \mathbb{R}_+$ by

$$d(x,y) := \sup\{u(y) - u(x) : u \in C^1(M),\ F(z, \nabla u(z)) \le 1 \text{ for all } z \in M\}.$$

If $F$ is $C^2$ on $TM \setminus \{0\}$, then this is equivalent to

$$d(x,y) = \inf_\gamma \int_0^1 F(\gamma, \dot\gamma(t))\,dt = \inf_\gamma \Big(\int_0^1 F^2(\gamma, \dot\gamma(t))\,dt\Big)^{1/2},$$

where the infimum is taken over all differentiable curves $\gamma : [0, 1] \to M$ with $\gamma(0) = x$ as well as
$\gamma(1) = y$. For fixed $y \in M$, the distance function $x \mapsto d(y,x)$ satisfies $F(x, \nabla d(y,x)) = 1$ for
almost every $x \in M$ (more precisely, for all $x \in M \setminus (\{y\} \cup \mathrm{Cut}(y))$ where $\mathrm{Cut}(y)$ is the cut
locus of $y$, see Chapter 5). Moreover, the distance function $d$ has the following properties of a
metric:

• $d(x, y) \ge 0$ for all $x, y \in M$ and $d(x, y) = 0$ if and only if $x = y$;

• $d(x, z) \le d(x, y) + d(y, z)$ for all $x, y, z \in M$.
Note that in general $d$ will not be symmetric. The function $\overleftarrow{d}(x, y) := d(y, x)$ will be the
distance function for the reverse Finsler structure $\overleftarrow{F}$ of $F$. Locally $d$ and $\overleftarrow{d}$ are comparable
thanks to the uniform ellipticity (1.2). We define the forward and backward open balls as

$$B^+(x, r) := \{y \in M : d(x, y) < r\}, \qquad B^-(x, r) := \{y \in M : d(y, x) < r\}$$

for $x \in M$ and $r > 0$. Closed balls are defined similarly.

Example 1.4 For each Minkowski space $(M,F) = (\mathbb{R}^n, \|\cdot\|)$ we have $d(x,y) = \|y - x\|$.
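
The asymmetry of $d$ is easy to observe in a Minkowski space with a non-reversible norm. The sketch below assumes, purely for illustration, a norm of Randers type $\|\xi\| = |\xi| + b \cdot \xi$ with a fixed vector $b$ of Euclidean length less than one; this particular norm and the sample points are not taken from the text.

```python
import numpy as np

B = np.array([0.5, 0.0])  # hypothetical drift vector with Euclidean length < 1

def randers_norm(xi):
    """Non-reversible Minkowski norm |xi| + B . xi on R^2."""
    xi = np.asarray(xi, dtype=float)
    return float(np.linalg.norm(xi) + B @ xi)

def dist(x, y):
    """Distance of the Minkowski space: d(x, y) = ||y - x||."""
    return randers_norm(np.asarray(y, dtype=float) - np.asarray(x, dtype=float))

x, y, z = [0.0, 0.0], [1.0, 0.0], [0.3, 0.7]
print(dist(x, y), dist(y, x))                # forward and backward distances differ
print(dist(x, y) <= dist(x, z) + dist(z, y)) # the triangle inequality still holds
```

Moving along $b$ costs $1.5$ while the way back costs $0.5$, so forward and backward balls of the same radius around a point are different sets.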

We say that a Finsler manifold $(M, F)$ is forward complete if every forward Cauchy sequence is
convergent. That is to say, if a sequence $\{x_n\}_{n \in \mathbb{N}} \subset M$ satisfies $\lim_{N \to \infty} \sup_{N \le n \le m} d(x_n, x_m) = 0$,
then there exists a point $x \in M$ such that $\lim_{n \to \infty} d(x_n, x) = 0$. By the Hopf-Rinow theorem
(cf. [BCS, Theorem 6.6.1]), forward completeness is equivalent to the compactness of every bounded
forward closed ball. We can similarly define backward completeness, which is nothing but
the forward completeness of $\overleftarrow{F}$. They are not equivalent because a forward Cauchy sequence
may not be a backward Cauchy sequence. Nonetheless, the convergence $\lim_{n \to \infty} d(x_n, x) = 0$ is
equivalent to $\lim_{n \to \infty} d(x, x_n) = 0$.

Observe from the definition of the Legendre transform that $\nabla u(x)$ points into the direction in
which $u$ increases the most. That is to say,

$$F(x, \nabla u(x)) = \limsup_{y \to x} \frac{u(y) - u(x)}{d(x,y)}.$$

If $u$ is $C^1$ and if $F$ is $C^2$ on $TM \setminus \{0\}$, then we have

$$-\int_0^1 F\big(\gamma, \nabla(-u)(\gamma)\big)F(\gamma, \dot\gamma)\,dt \le u(\gamma(1)) - u(\gamma(0)) \le \int_0^1 F\big(\gamma, \nabla u(\gamma)\big)F(\gamma, \dot\gamma)\,dt$$

for any $C^1$-curve $\gamma : [0, 1] \to M$. Note the difference between $\nabla(-u)$ and $-\nabla u$.



2 Finsler Laplacian 

Besides the Finsler structure $F$ on $M$, throughout the paper, we fix a measure $m$ on $M$. We
always assume that this is locally bounded from above and below in terms of the volume form,
i.e., each point $x \in M$ has a neighborhood $U$ with a local coordinate system $(x^i)_{i=1}^n$ such that

$$m(dx) = e^{-V(x)}\,dx^1 \cdots dx^n \qquad (2.1)$$

for some bounded measurable function $V : U \to \mathbb{R}$.

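Given a smooth vector field $\Psi : M \to TM$, we define its divergence $\mathrm{div}\,\Psi : M \to \mathbb{R}$ through the
identity

$$\int_M u\,\mathrm{div}\,\Psi\,dm = -\int_M \Psi u\,dm = -\int_M Du \cdot \Psi\,dm \qquad (2.2)$$

for all $u \in C_c^\infty(M)$, where $Du \cdot \Psi = Du(\Psi)$ at $x$ denotes the canonical pairing between $T_x^*M$
and $T_xM$. If in local coordinates $(x^i)_{i=1}^n$ the measure $m$ and the vector field $\Psi$ are given as
$m(dx) = e^{-V(x)}\,dx^1 \cdots dx^n$ and $\Psi(x) = \sum_{i=1}^n \Psi^i(x)(\partial/\partial x^i)$ with differentiable functions $V$ and
$\Psi^i$, then integration by parts in (2.2) gives

$$\mathrm{div}\,\Psi(x) = \sum_{i=1}^n \Big(\frac{\partial\Psi^i}{\partial x^i}(x) - \Psi^i(x)\frac{\partial V}{\partial x^i}(x)\Big).$$

The concept of $\mathrm{div}\,\Psi$ extends in an obvious way to smooth vector fields defined on open subsets
as well as to vector fields which are only weakly differentiable.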

Definition 2.1 A Finsler space is a triple (M,F,m) consisting of a smooth, finite dimensional 
manifold M, a Finsler structure F on M and a measure m on M as above. 
Note that, in this setting, the gradient depends on $F$ and the divergence on $m$. Both $F$ and
$m$ can be chosen independently. The reason why we consider an arbitrary measure rather than
constructive ones (such as the Busemann-Hausdorff and the Holmes-Thompson measures) will
be explained in Chapter 5.

Given an open set $\Omega \subset M$, the energy functional $\mathcal{E}_\Omega : H^1_{\mathrm{loc}}(\Omega) \to [0, \infty]$ on $\Omega$ is defined by

$$\mathcal{E}_\Omega(u) := \frac{1}{2}\int_\Omega F^{*2}(x, Du(x))\,m(dx) = \frac{1}{2}\int_\Omega F^2(x, \nabla u(x))\,m(dx).$$

We will suppress $\Omega$ if $\Omega = M$, i.e., $\mathcal{E} := \mathcal{E}_M$. Clearly $\mathcal{E}_\Omega^{1/2}$ is convex and positively homogeneous.
Note that this energy functional coincides with Cheeger's one [Ch] in terms of upper gradients
(see also [Sha]). In order to make full use of Ricci curvature assumptions, this seems more
suitable than the energy functional in terms of averaged difference quotients as studied for
instance in [St1] or [KS]. The averaged energy incorporates a linearization of the operator (or
the semigroup). However, the 'canonical' Laplacians and heat semigroups on Finsler manifolds
are always nonlinear, except in the Riemannian case. See also [Oh1] for related work.

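Recall that the classes $L^2_{\mathrm{loc}}(\Omega)$ and $H^1_{\mathrm{loc}}(\Omega)$ are defined solely in terms of the manifold structure
of $M$ (i.e., independent of the choices of $F$ and $m$). Let $H^1(\Omega) := \{u \in H^1_{\mathrm{loc}}(\Omega) : \mathcal{E}_\Omega(u) < \infty\}$
and $H^1_0(\Omega)$ be the closure of $C_c^\infty(\Omega)$ (or, equivalently, $H^1_c(\Omega)$) in $H^1(\Omega)$ with respect to the
(Minkowski) norm $\|u\|_{H^1} := \|u\|_{L^2} + \mathcal{E}_\Omega(u)^{1/2}$. The dual space to $H^1_0(\Omega)$ is denoted by $H^{-1}(\Omega)$.
Define the energy functional with Dirichlet boundary conditions $\mathcal{E}^0_\Omega : L^2(\Omega) \to [0, \infty]$ by $\mathcal{E}^0_\Omega(u) :=
\mathcal{E}_\Omega(u)$ for $u \in H^1_0(\Omega)$ and $\mathcal{E}^0_\Omega(u) := +\infty$ else. The ground state energy (inverse Poincaré constant)
is given by

$$\chi_\Omega := \inf\{2\mathcal{E}_\Omega(u) : u \in H^1_0(\Omega),\ \|u\|_{L^2} = 1\}.$$

If $\chi_\Omega = 0$ (e.g., if $\Omega$ is compact) then it is more convenient to consider

$$\overline{\chi}_\Omega := \inf\Big\{2\mathcal{E}_\Omega(u) : u \in H^1_0(\Omega),\ \|u\|_{L^2} = 1,\ \int_\Omega u\,dm = 0\Big\}. \qquad (2.3)$$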

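Lemma 2.2 (i) The energy functional $\mathcal{E}^0_\Omega$ is lower semicontinuous on $L^2(\Omega)$.

(ii) If $\Omega$ is relatively compact, then $H^1_0(\Omega)$ is proper in the sense that every bounded sequence
in $(H^1_0(\Omega), \|\cdot\|_{H^1})$ contains a convergent subsequence.

(iii) If $\Omega$ is relatively compact and connected with non-polar boundary, then $\chi_\Omega > 0$. If $M$ is
compact, then $\overline{\chi}_M > 0$.

(iv) The functional $\mathcal{E}^0_\Omega$ is $K$-convex on $L^2(\Omega, m)$ with $K = \chi_\Omega\kappa_\Omega$. Moreover, for each $C \in \mathbb{R}$
it is $\overline{K}$-convex on the convex set $\{u \in L^2(\Omega) : \int_\Omega u\,dm = C\}$ with $\overline{K} = \overline{\chi}_\Omega\kappa_\Omega$.

Proof: (i) - (iii) are standard facts. In fact, we can reduce (ii) and (iii) to a Riemannian
structure (bi-Lipschitz) equivalent to $F$.

(iv) Recall that the dual version of (1.5) states

$$F^{*2}\Big(x, \frac{\alpha+\beta}{2}\Big) \le \frac{1}{2}F^{*2}(x,\alpha) + \frac{1}{2}F^{*2}(x,\beta) - \frac{\kappa_\Omega}{4}F^{*2}(x,\beta-\alpha)$$

for all $x \in \Omega$ and $\alpha, \beta \in T_x^*M$. Hence,

$$\mathcal{E}^0_\Omega\Big(\frac{u+v}{2}\Big) \le \frac{1}{2}\mathcal{E}^0_\Omega(u) + \frac{1}{2}\mathcal{E}^0_\Omega(v) - \frac{\kappa_\Omega}{4}\mathcal{E}^0_\Omega(u-v).$$

The last term can be estimated by $2\mathcal{E}^0_\Omega(u - v) \ge \chi_\Omega\|u - v\|^2_{L^2}$ for all $u, v \in L^2(\Omega)$ and by
$2\mathcal{E}^0_\Omega(u - v) \ge \overline{\chi}_\Omega\|u - v\|^2_{L^2}$ if in addition $\int_\Omega u\,dm = \int_\Omega v\,dm$. This proves the $K$- (and $\overline{K}$-,
respectively) convexity. ■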
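
The constant in (iv) can be traced explicitly. The worked step below is a sketch under the assumption that $K$-convexity is understood in its midpoint form, $f(\frac{u+v}{2}) \le \frac12 f(u) + \frac12 f(v) - \frac{K}{8}\|u-v\|^2_{L^2}$; this normalization is not spelled out in the text and is adopted here only for illustration. The symbols $\chi_\Omega$ (ground state energy) and $\kappa_\Omega$ (constant in (1.4)) are as above.

```latex
\begin{align*}
\mathcal{E}^0_\Omega\Big(\frac{u+v}{2}\Big)
  &\le \frac12\,\mathcal{E}^0_\Omega(u) + \frac12\,\mathcal{E}^0_\Omega(v)
       - \frac{\kappa_\Omega}{4}\,\mathcal{E}^0_\Omega(u-v) \\
  &\le \frac12\,\mathcal{E}^0_\Omega(u) + \frac12\,\mathcal{E}^0_\Omega(v)
       - \frac{\kappa_\Omega\,\chi_\Omega}{8}\,\|u-v\|_{L^2}^2 ,
\end{align*}
% second line: 2 E^0(u-v) >= chi_Omega ||u-v||^2 by the definition of chi_Omega
```

Comparing with the midpoint form gives $K = \chi_\Omega\kappa_\Omega$; replacing $\chi_\Omega$ by the mean-zero constant from (2.3) gives the second statement in the same way.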
We define the Finsler Laplacian $\Delta$ acting on functions $u \in H^1_{\mathrm{loc}}(\Omega)$ formally by $\Delta u := \mathrm{div}(\nabla u)$
(cf. [Sh2], [BKJ]). To be more precise, $\Delta u$ is the distributional Laplacian defined through the
identity

$$\int_\Omega v\,\Delta u\,dm = -\int_\Omega Dv(\nabla u)\,dm$$

for all $v \in H^1_c(\Omega)$ (or, equivalently, for all $v \in C_c^\infty(\Omega)$). Recall that at points $x$ with $\nabla u(x) = 0$
the function $\nabla u$ in general will not be differentiable (even if the function $u$ itself and the norm
$F$ are smooth). Note the sign convention: our Laplacian is a negative operator, i.e.,

$$\int_\Omega u\,\Delta u\,dm \le 0$$

for all $u \in H^1_0(\Omega)$ (and equality holds if and only if $u$ is constant a.e. on each connected
component of $\Omega$). The Finsler Laplacian is a linear operator if and only if $F$ is a Riemannian
structure (i.e., $F(x,\cdot)$ is a Hilbert norm for a.e. $x \in M$).

Given $g \in L^2_{\mathrm{loc}}(\Omega)$, a function $u \in H^1_{\mathrm{loc}}(\Omega)$ is called a weak solution of $\Delta u = g$ in $\Omega$ if

$$-\int_\Omega Dv(\nabla u)\,dm = \int_\Omega vg\,dm$$

for all $v \in H^1_c(\Omega)$. A function $u \in H^1_{\mathrm{loc}}(\Omega)$ is said to be weakly harmonic on $\Omega$ if it is a weak
solution of $\Delta u = 0$ in $\Omega$.

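Lemma 2.3 A function $u \in H^1_{\mathrm{loc}}(\Omega)$ is weakly harmonic on $\Omega$ if and only if it is a minimizer
of the energy functional $\mathcal{E}_{\Omega'}$ on each open set $\Omega'$ relatively compact in $\Omega$, i.e.,

$$\mathcal{E}_{\Omega'}(u) = \inf\{\mathcal{E}_{\Omega'}(u + v) : v \in H^1_0(\Omega')\}.$$

A function $u \in H^1(\Omega)$ is weakly harmonic on $\Omega$ if and only if it is a minimizer of $\mathcal{E}_\Omega$, i.e.,

$$\mathcal{E}_\Omega(u) = \inf\{\mathcal{E}_\Omega(u + v) : v \in H^1_0(\Omega)\}.$$

Proof: The first claim immediately follows from the calculation

$$\frac{d}{d\delta}\mathcal{E}_{\Omega'}(u + \delta v)\Big|_{\delta=0} = \int_{\Omega'} \frac{\partial}{\partial\delta}\Big[\frac{1}{2}F^{*2}\big(x, Du(x) + \delta Dv(x)\big)\Big]_{\delta=0} m(dx) = \int_{\Omega'} Dv(\nabla u)\,dm,$$

where for the second equation we used Lemma 1.1(ii). For the second claim, in addition we take
the estimate $\int_\Omega Dv(\nabla u)\,dm \le 2\mathcal{E}_\Omega(v)^{1/2}\mathcal{E}_\Omega(u)^{1/2}$ into account. ■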

We also introduce a weighted Laplacian associated with a Riemannian structure induced from the
gradient vector field of some function. Given a function $u \in H^1_{\mathrm{loc}}(\Omega)$, we define the Riemannian
tensor $g^{(u)}$ by

$$g^{(u)}(x) := g_{\nabla u}(x) := g(x, \nabla u(x)) = \big(g_{ij}(x, \nabla u(x))\big)_{i,j=1}^n \qquad (2.4)$$

for each $x \in M$ where $\nabla u(x) \in T_xM$ is well-defined and nonzero. Otherwise, we put $g^{(u)}(x) :=
g(x, Z(x))$ for some fixed nonvanishing vector field $Z$ on $M$. Note that its inverse is given by

$$g^{(u)}(x)^{-1} =: g^{*(u)}(x) = g^*(x, Du(x)).$$

For each $u \in H^1_{\mathrm{loc}}(\Omega)$, define the weighted Laplacian $\Delta^{(u)}$ acting on functions $w \in H^1_{\mathrm{loc}}(\Omega)$ in
the sense of distributions by $\Delta^{(u)}w := \mathrm{div}(g^{*(u)}Dw)$. Here

$$g^{*(u)}(x)Dw(x) = \sum_{i,j=1}^n g^*_{ij}(x, Du(x))\,\frac{\partial w}{\partial x^i}(x)\,\frac{\partial}{\partial x^j} \in T_xM.$$
Lemma 2.4 For any $u \in H^1_{\mathrm{loc}}(\Omega)$, we have $\Delta u = \Delta^{(u)}u$ in the sense of distributions on $\Omega$.
More precisely, for all $v \in H^1_c(\Omega)$, it holds that

$$\int_\Omega v\,\Delta^{(u)}u\,dm = \int_\Omega v\,\Delta u\,dm.$$

Proof: Lemma 1.1(iii), (1.10) and the definition of the divergence (2.2) yield that

$$\int_\Omega v\,\Delta^{(u)}u\,dm = -\int_\Omega Dv\big(g^{*(u)}Du\big)\,dm = -\int_\Omega Dv(\nabla u)\,dm = \int_\Omega v\,\Delta u\,dm.$$

■

We will use the above lemma to show a generalized Laplacian comparison theorem in Theorem
5.2 as well as Corollary 5.3. Compare them with the following remark.



Remark 2.5 Let $(\mathbb{R}^n, \|\cdot\|, m)$ be a Minkowski space equipped with the Lebesgue measure,
and put $u(x) = f(\|x - y\|)$ for some nondecreasing $C^2$-function $f$ on $\mathbb{R}_+$ and some fixed point
$y \in \mathbb{R}^n$. Then we have, for any $x \ne y$,

$$\Delta u(x) = f''(\|x - y\|) + \frac{n-1}{\|x - y\|}f'(\|x - y\|). \qquad (2.5)$$

In particular, $\Delta(\|x - y\|^2) = 2n$. If $f$ is nonincreasing, then an analogous result holds true for
$v(x) = f(\|y - x\|)$, namely

$$\Delta v(x) = f''(\|y - x\|) + \frac{n-1}{\|y - x\|}f'(\|y - x\|). \qquad (2.6)$$

This is because, for nonincreasing $f$, the right-hand side of (2.5) coincides with $-\Delta(-u)(x) =
\overleftarrow{\Delta}u(x)$, where $\overleftarrow{\Delta}$ stands for the Finsler Laplacian for the reverse Finsler structure $\overleftarrow{F}(x,\xi) =
\|-\xi\|$. Similarly, for nondecreasing $f$, the right-hand side of (2.6) coincides with $\overleftarrow{\Delta}v(x)$.

Proof: We deduce from Lemma 1.1(ii) that

$$Du(x) = f'(\|x - y\|)\,D(\|x - y\|) = \frac{f'(\|x - y\|)}{\|x - y\|}\,J(x - y).$$

For nondecreasing $f$, we find

$$\Delta u(x) = \mathrm{div}\Big(\frac{f'(\|x - y\|)}{\|x - y\|}(x - y)\Big) = \sum_{i=1}^n \frac{\partial}{\partial x^i}\Big(\frac{f'(\|x - y\|)}{\|x - y\|}(x^i - y^i)\Big),$$

which implies the claim since $\sum_{i=1}^n (\partial\|x - y\|/\partial x^i)(x^i - y^i) = \|x - y\|$ by Euler's theorem (cf.
[BCS, Theorem 1.2.1]). ■
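
As a quick consistency check of (2.5), the special case $f(r) = r^2$ mentioned in the remark can be worked out by hand. The abbreviation $r := \|x - y\|$ is introduced here only for readability.

```latex
% f(r) = r^2 is nondecreasing on R_+, with f'(r) = 2r and f''(r) = 2.
\begin{align*}
\Delta\big(\|x-y\|^2\big)
  &= f''(r) + \frac{n-1}{r}\,f'(r) \\
  &= 2 + \frac{n-1}{r}\cdot 2r \\
  &= 2n .
\end{align*}
```

The value $2n$ does not depend on the norm, in agreement with the Euclidean identity $\Delta|x - y|^2 = 2n$.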



3 The Heat Equation — Global Solutions 

To simplify the presentation, we will assume throughout this chapter that the general
assumptions of the previous chapters are satisfied. That is, $(M, F, m)$ is a Finsler space with a Finsler
structure $F$ satisfying (1.2) and a measure $m$ satisfying (2.1). We remark, however, that instead
of (1.2) for the sequel it suffices to assume that $F(x,\cdot)$ is strictly convex and differentiable on
$T_xM \setminus \{0\}$ for a.e. $x$.
Definition 3.1 We say that $u$ is a global solution to the heat equation $\partial_t u = \Delta u$ on $[0, T] \times M$
if $u \in L^2([0,T], H^1_0(M)) \cap H^1([0,T], H^{-1}(M))$ and if, for every $t \in [0,T]$ and $v \in H^1_0(M)$, it
holds that

$$\int_M v\,\partial_t u_t\,dm = -\int_M Dv(\nabla u_t)\,dm. \qquad (3.1)$$

Here $u_t(x) := u(t, x)$. To be more precise, our global solutions are always global solutions with
Dirichlet boundary conditions.

Remark 3.2 (i) The condition $u \in L^2([0, T], H^1_0(M)) \cap H^1([0,T], H^{-1}(M))$ implies that $u \in
C([0,T], L^2(M))$ (see, e.g., [EH p. 287]). Equivalently we could require that the above identity
(3.1) holds for all $v \in L^2([0, T], H^1_0(M))$ and a.e. $t \in [0,T]$.

(ii) If $M$ is compact, then every global solution $u$ to the heat equation is mass preserving, i.e.,
$\int_M u_t\,dm = \int_M u_0\,dm$ holds for all $t$. Indeed, choosing constant $v = 1 \in L^2([0, T], H^1_0(M))$ as
test function yields the claim.

Now we are going to construct a global solution to the heat equation as gradient flow of the
energy functional $\mathcal{E}^0$ on $L^2(M)$. Since $\mathcal{E}^0$ is a convex function on the Hilbert space $L^2(M)$, we can
apply Crandall and Liggett's classical technique [CL] (see also [Ma], [AGS] for generalizations
to curved spaces). To simplify notation, we use $\mathcal{E}$ instead of $\mathcal{E}^0$ but take care to evaluate it only
on $H^1_0(M)$.

Given $u \in H^1_0(M)$, we define

$$|\nabla(-\mathcal{E})|(u) := \max\Big\{0,\ \limsup_{v \to u} \frac{\mathcal{E}(u) - \mathcal{E}(v)}{\|u - v\|_{L^2}}\Big\},$$

where $v \in H^1_0(M)$ and the convergence $v \to u$ is with respect to the $L^2$-norm. Note that the
convexity of $\mathcal{E}$ implies that $|\nabla(-\mathcal{E})|(u) = 0$ holds if and only if $u$ is a minimizer of $\mathcal{E}$ on $H^1_0(M)$.

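Lemma 3.3 If $0 < |\nabla(-\mathcal{E})|(u) < \infty$, then there exists unique $v \in L^2(M)$ satisfying $\|v\|_{L^2} =
|\nabla(-\mathcal{E})|(u)$ as well as

$$\lim_{t \downarrow 0} \frac{\mathcal{E}(u) - \mathcal{E}(u + tv)}{t\|v\|_{L^2}} = |\nabla(-\mathcal{E})|(u).$$

Proof: Take a sequence $\{v_i\}_{i \in \mathbb{N}} \subset H^1_0(M) \setminus \{0\}$ such that

$$\lim_{i \to \infty} \frac{\mathcal{E}(u) - \mathcal{E}(u + v_i)}{\|v_i\|_{L^2}} = |\nabla(-\mathcal{E})|(u).$$

We put $\bar{v}_i := (|\nabla(-\mathcal{E})|(u)/\|v_i\|_{L^2}) \cdot v_i$ and deduce from the convexity of $\mathcal{E}$ that

$$\lim_{i \to \infty} \lim_{t \downarrow 0} \frac{\mathcal{E}(u) - \mathcal{E}(u + t\bar{v}_i)}{t\|\bar{v}_i\|_{L^2}} \ge \lim_{i \to \infty} \frac{\mathcal{E}(u) - \mathcal{E}(u + v_i)}{\|v_i\|_{L^2}} = |\nabla(-\mathcal{E})|(u).$$

Thus we have $\lim_{i \to \infty} \lim_{t \downarrow 0} \{\mathcal{E}(u) - \mathcal{E}(u + t\bar{v}_i)\}/t\|\bar{v}_i\|_{L^2} = |\nabla(-\mathcal{E})|(u)$. Moreover, for any $i, j \ge 1$,
we see

$$|\nabla(-\mathcal{E})|(u) \ge \lim_{t \downarrow 0} \frac{\mathcal{E}(u) - \mathcal{E}(u + t(\bar{v}_i + \bar{v}_j)/2)}{t\|(\bar{v}_i + \bar{v}_j)/2\|_{L^2}}
\ge \frac{|\nabla(-\mathcal{E})|(u)}{\|(\bar{v}_i + \bar{v}_j)/2\|_{L^2}} \lim_{t \downarrow 0} \Big\{\frac{\mathcal{E}(u) - \mathcal{E}(u + t\bar{v}_i)}{2t\|\bar{v}_i\|_{L^2}} + \frac{\mathcal{E}(u) - \mathcal{E}(u + t\bar{v}_j)}{2t\|\bar{v}_j\|_{L^2}}\Big\}.$$

This implies that $\lim_{i,j \to \infty} \|\bar{v}_i + \bar{v}_j\|_{L^2} = 2|\nabla(-\mathcal{E})|(u)$. Hence $\{\bar{v}_i\}_{i \in \mathbb{N}}$ is a Cauchy sequence and
converges to some $v \in L^2(M)$. Uniqueness is deduced in a similar way. ■

We define $\nabla(-\mathcal{E})(u) := v$ using $v \in L^2(M)$ as in Lemma 3.3 above and call $\nabla(-\mathcal{E})(u)$ the
gradient vector of $-\mathcal{E}$ at $u$. We simply set $\nabla(-\mathcal{E})(u) := 0 \in L^2(M)$ if $|\nabla(-\mathcal{E})|(u) = 0$.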
For $u_0 \in H^1_0(M)$ and $\delta > 0$, we denote by $U_\delta(u_0) \in H^1_0(M)$ the
unique minimizer of the function

$$u \mapsto \mathcal{E}(u) + \frac{\|u - u_0\|^2_{L^2}}{2\delta}. \qquad (3.2)$$

This can be regarded as a discrete approximation of a gradient flow of $\mathcal{E}$. In fact, $(U_{t/n})^n(u_0)$
converges to a continuous curve $u : \mathbb{R}_+ \to H^1_0(M)$ with $u(0) = u_0$ as $n$ goes to infinity, and
$u_t := u(t)$ satisfies the following properties (see, e.g., [Ma, Theorem 1.13 & Section 2]):

(i) The curve $t \mapsto u_t$ in $L^2(M)$ is locally Lipschitz continuous on $(0, \infty)$ and satisfies, for a.e.
$t > 0$,

$$\lim_{\delta \to 0} \frac{\|u_{t+\delta} - u_t\|_{L^2}}{|\delta|} = |\nabla(-\mathcal{E})|(u_t). \qquad (3.3)$$

In particular, we have $|\nabla(-\mathcal{E})|(u_t) < \infty$ at every $t > 0$.

(ii) For a.e. $t > 0$, it holds that

$$\lim_{\delta \to 0} \frac{\mathcal{E}(u_t) - \mathcal{E}(u_{t+\delta})}{\delta} = |\nabla(-\mathcal{E})|(u_t)^2. \qquad (3.4)$$

Thanks to (i) and (ii) above, a similar discussion to the proof of Lemma 3.3 ensures that, for
a.e. $t > 0$,

$$\lim_{\delta \to 0} \Big\|\frac{u_{t+\delta} - u_t}{\delta} - \nabla(-\mathcal{E})(u_t)\Big\|_{L^2} = 0. \qquad (3.5)$$

In other words, $\partial_t u_t = \nabla(-\mathcal{E})(u_t)$ in the weak sense. If we replace the limit with the right limit
$\lim_{\delta \downarrow 0}$, then equations (3.3) and (3.4) hold for all $t > 0$ and (3.5) holds for all $t > 0$. In addition,
we find

$$\lim_{\delta \downarrow 0} \frac{\|u_{t+\delta} - U_\delta(u_t)\|_{L^2}}{\delta} = 0 \qquad (3.6)$$

for all $t > 0$ along the same lines as [Oh2, Lemma 6.4].
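
The discrete scheme (3.2) is an implicit Euler step, and its behavior can be observed in the simplest Riemannian special case. The sketch below is only an illustration under assumptions that differ from the setting above: it uses the Euclidean norm on a periodic one-dimensional grid (so there is no Dirichlet boundary and the scheme is linear), and all names and parameters are hypothetical. In this case the minimizer of (3.2) solves the linear system $(I - \delta L)u = u_0$, where $L$ is the discrete Laplacian.

```python
import numpy as np

def periodic_laplacian(n, h):
    """Second-difference matrix on n periodic grid points with spacing h."""
    L = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    L[0, -1] = L[-1, 0] = 1.0
    return L / h**2

def energy(u, h):
    """Discrete energy (1/2) * sum of squared forward differences, times h."""
    du = (np.roll(u, -1) - u) / h
    return 0.5 * h * float(np.sum(du**2))

def resolvent_step(u0, delta, L):
    """Minimizer of E(u) + |u - u0|^2 / (2 delta), i.e. (I - delta L) u = u0."""
    return np.linalg.solve(np.eye(len(u0)) - delta * L, u0)

n, t, steps = 64, 0.05, 20
h = 1.0 / n
grid = h * np.arange(n)
u = np.sin(2 * np.pi * grid) + 0.3 * np.cos(6 * np.pi * grid) + 1.0
L = periodic_laplacian(n, h)

mass0 = h * u.sum()
energies = [energy(u, h)]
for _ in range(steps):                 # (U_{t/steps})^steps applied to u_0
    u = resolvent_step(u, t / steps, L)
    energies.append(energy(u, h))
mass1 = h * u.sum()

print(energies[0], energies[-1])       # the energy decreases along the scheme
print(mass0, mass1)                    # the total mass is preserved
```

Each step can only lower the energy, because the previous iterate is itself a competitor in the minimization; on a Finsler space the linear solve would have to be replaced by a genuinely nonlinear convex minimization.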

Theorem 3.4 For each $u_0 \in H^1_0(M)$ and $T > 0$, there exists a global solution $u$ to the heat
equation which lies in $L^2([0, T], H^1_0(M)) \cap H^1([0, T], L^2(M))$. Moreover, for each $t \in (0,T)$, the
distributional Laplacian $\Delta u_t$ is absolutely continuous with respect to $m$ and its density function
is $\nabla(-\mathcal{E})(u_t)$. In particular, $\partial_t u_t = \Delta u_t$ in the weak sense (see (3.5)) and we have

$$\lim_{\delta \downarrow 0} \frac{\mathcal{E}(u_t) - \mathcal{E}(u_{t+\delta})}{\delta} = |\nabla(-\mathcal{E})|(u_t)^2 = \|\nabla(-\mathcal{E})(u_t)\|^2_{L^2} = \|\Delta u_t\|^2_{L^2} \qquad (3.7)$$

for all $t > 0$.

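Proof: Let $u : \mathbb{R}_+ \to H^1_0(M)$ be the gradient curve of $\mathcal{E}$ constructed as the limit curve of the
discrete approximation (3.2). Note that $\int_0^T \mathcal{E}(u_t)^2\,dt \le T\mathcal{E}(u_0)^2 < \infty$ and

$$\int_0^T \|\partial_t u_t\|^2_{L^2}\,dt = \int_0^T |\nabla(-\mathcal{E})|(u_t)^2\,dt = \mathcal{E}(u_0) - \mathcal{E}(u_T) < \infty.$$

Thus we observe $u \in L^2([0, T], H^1_0(M)) \cap H^1([0, T], L^2(M))$.

We shall show that, for any $v \in L^2([0, T], H^1_0(M)) \cap H^1([0,T], L^2(M))$ and $0 \le t_0 < t_1 \le T$, it
holds that

$$\int_M v_{t_1}u_{t_1}\,dm - \int_M v_{t_0}u_{t_0}\,dm = \int_{t_0}^{t_1}\int_M \Big\{\frac{\partial v_t}{\partial t}u_t - Dv_t(\nabla u_t)\Big\}\,dm\,dt. \qquad (3.8)$$

Fix $t \in (t_0, t_1)$. Given small $\delta, \varepsilon > 0$, consider unique $w_t^\delta := U_\delta(u_t)$ minimizing the function
$w \mapsto \mathcal{E}(w) + \|w - u_t\|^2_{L^2}/2\delta$ and put $w_t^{\delta,\varepsilon} := w_t^\delta + \varepsilon v_t$. Then the choice of $w_t^\delta$ yields

$$\mathcal{E}(w_t^\delta) + \frac{\|w_t^\delta - u_t\|^2_{L^2}}{2\delta} \le \mathcal{E}(w_t^{\delta,\varepsilon}) + \frac{\|w_t^{\delta,\varepsilon} - u_t\|^2_{L^2}}{2\delta}.$$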
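Firstly, we have

$$\|w_t^{\delta,\varepsilon} - u_t\|^2_{L^2} - \|w_t^\delta - u_t\|^2_{L^2} = 2\varepsilon(w_t^\delta - u_t, v_t)_{L^2} + \varepsilon^2\|v_t\|^2_{L^2},$$

and hence

$$\lim_{\varepsilon \to 0} \frac{\|w_t^{\delta,\varepsilon} - u_t\|^2_{L^2} - \|w_t^\delta - u_t\|^2_{L^2}}{\varepsilon} = 2(w_t^\delta - u_t, v_t)_{L^2} = 2\Big\{\int_M v_t w_t^\delta\,dm - \int_M v_t u_t\,dm\Big\}.$$

Secondly, it follows from Lemma 1.1(ii) that

$$\lim_{\varepsilon \to 0} \frac{\mathcal{E}(w_t^{\delta,\varepsilon}) - \mathcal{E}(w_t^\delta)}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon}\int_M \big\{F^{*2}(Dw_t^\delta + \varepsilon Dv_t) - F^{*2}(Dw_t^\delta)\big\}\,dm = \int_M Dv_t(\nabla w_t^\delta)\,dm.$$

Therefore

$$\liminf_{\delta \downarrow 0} \frac{1}{\delta}\Big\{\int_M v_t w_t^\delta\,dm - \int_M v_t u_t\,dm\Big\} \ge -\lim_{\delta \downarrow 0}\int_M Dv_t(\nabla w_t^\delta)\,dm = -\int_M Dv_t(\nabla u_t)\,dm.$$

The second equality follows from the choice of $w_t^\delta$. In fact, $\{Dw_t^\delta\}_{\delta > 0}$ is a Cauchy sequence
converging to $Du_t$ as $\delta$ tends to zero. Together with (3.6), we have

$$\liminf_{\delta \downarrow 0} \frac{1}{\delta}\Big\{\int_M v_{t+\delta}u_{t+\delta}\,dm - \int_M v_t u_t\,dm\Big\}$$
$$= \liminf_{\delta \downarrow 0} \frac{1}{\delta}\Big\{\int_M (v_{t+\delta} - v_t)u_{t+\delta}\,dm + \int_M v_t u_{t+\delta}\,dm - \int_M v_t u_t\,dm\Big\}$$
$$= \int_M \frac{\partial v_t}{\partial t}u_t\,dm + \liminf_{\delta \downarrow 0} \frac{1}{\delta}\Big\{\int_M v_t w_t^\delta\,dm - \int_M v_t u_t\,dm\Big\}$$
$$\ge \int_M \Big\{\frac{\partial v_t}{\partial t}u_t - Dv_t(\nabla u_t)\Big\}\,dm.$$

We obtain the reverse inequality by exchanging $v$ with $-v$, and complete the proof of (3.8).
By virtue of (3.5), choosing time independent $v \in H^1_0(M)$ in (3.8) shows that

$$\int_M v\,\nabla(-\mathcal{E})(u_t)\,dm = \lim_{\delta \downarrow 0}\int_M v\,\frac{u_{t+\delta} - u_t}{\delta}\,dm = -\int_M Dv(\nabla u_t)\,dm = \int_M v\,\Delta u_t\,dm.$$

Hence $\Delta u_t$ is absolutely continuous with respect to $m$ and the density function is nothing but
$\nabla(-\mathcal{E})(u_t)$. ■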

The following proposition ensures that the gradient flow constructed as above is actually a
unique solution to the heat equation. In particular, for each $u_0 \in L^2(M)$, it allows us to construct
a unique gradient curve $(u_t)_{t \ge 0}$ starting from $u_0$ as the limit of a sequence of gradient curves
$(u_t^{(n)})_{t \ge 0} \subset H^1_0(M)$ such that $u_0^{(n)}$ tends to $u_0$ in $L^2(M)$ as $n$ goes to infinity. We denote this
curve by $(P_t u_0)_{t \ge 0}$. The map $P_t : u \mapsto P_t u$ defines a (non-expanding) semigroup of nonlinear
operators on $L^2(M)$, and we call it the heat semigroup.

Proposition 3.5 For all global solutions $u, v$ to the heat equation, we have

$$\partial_t\Big(\frac{1}{2}\|u_t\|^2_{L^2}\Big) = -2\mathcal{E}(u_t), \qquad (3.9)$$

$$\partial_t\Big(\frac{1}{2}\|u_t - v_t\|^2_{L^2}\Big) \le -2\kappa_M\mathcal{E}(u_t - v_t) \le 0 \qquad (3.10)$$

with $\kappa_M$ as introduced in (1.4).

Proof: Assuming that both $u$ and $v$ are global solutions to the heat equation and choosing
$u - v$ as test function (for each of these solutions) yields

$$\partial_t\Big(\frac{1}{2}\|u_t - v_t\|^2_{L^2}\Big) = \int_M (u_t - v_t)(\partial_t u_t - \partial_t v_t)\,dm = -\int_M (Du_t - Dv_t)(\nabla u_t - \nabla v_t)\,dm.$$

In the case $v = 0$ the last term obviously coincides with $-2\mathcal{E}(u_t)$ which proves the first claim.
In the general case, the last term of the previous identities can be estimated from above according
to Lemma 1.1(v) which asserts

$$(Du_t - Dv_t)(\nabla u_t - \nabla v_t) \ge \kappa_M F^{*2}(Du_t - Dv_t).$$

This proves the second claim. ■

Corollary 3.6 For all global solutions $u, v$ to the heat equation, we have

$$\|u_t - v_t\|_{L^2} \le e^{-\kappa_M\chi_M t} \cdot \|u_0 - v_0\|_{L^2}.$$

If in addition $\int_M u_0\,dm = \int_M v_0\,dm$, then in the estimate above $\chi_M$ can be replaced by $\overline{\chi}_M$, i.e.,

$$\|u_t - v_t\|_{L^2} \le e^{-\kappa_M\overline{\chi}_M t} \cdot \|u_0 - v_0\|_{L^2};$$

if $v = 0$ then $\kappa_M$ can be replaced by 1, i.e.,

$$\|u_t\|_{L^2} \le e^{-\chi_M t} \cdot \|u_0\|_{L^2}.$$

Proof: The estimates follow immediately from Proposition 3.5 (together with the definition of
$\chi_M$ and $\overline{\chi}_M$) and an application of Gronwall's lemma. ■
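
The Gronwall step in the proof can be written out as follows. This is a sketch of the standard argument; the auxiliary function $\varphi$ is notation introduced here, and $\chi_M$ denotes the ground state energy, so that $2\mathcal{E}(w) \ge \chi_M\|w\|^2_{L^2}$ for $w \in H^1_0(M)$.

```latex
% phi(t) := (1/2) ||u_t - v_t||_{L^2}^2
\begin{align*}
\varphi'(t)
  &\le -2\kappa_M\,\mathcal{E}(u_t - v_t)
   && \text{by (3.10)} \\
  &\le -\kappa_M\chi_M\,\|u_t - v_t\|_{L^2}^2
   = -2\kappa_M\chi_M\,\varphi(t) , \\
\varphi(t)
  &\le e^{-2\kappa_M\chi_M t}\,\varphi(0)
   && \text{by Gronwall's lemma,} \\
\|u_t - v_t\|_{L^2}
  &\le e^{-\kappa_M\chi_M t}\,\|u_0 - v_0\|_{L^2}
   && \text{after taking square roots.}
\end{align*}
```

The two variants of the estimate follow in the same way, using the mean-zero constant from (2.3) when the initial masses agree, and (3.9) in place of (3.10) when $v = 0$.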

The previous are the usual contraction properties of gradient flows, for $\mathcal{E}^0$ is $\kappa_M\chi_M$-convex on
the Hilbert space $L^2(M)$ (Lemma 2.2(iv)). Recall that compactness of $M$ will imply $\kappa_M > 0$
and $\overline{\chi}_M > 0$. A slightly modified argument will yield contraction in $L^p(M)$ for each $p$.

Theorem 3.7 For all $p \in [1, \infty]$ and all global solutions $u, v$ to the heat equation, we have

$$\|u_t - v_t\|_{L^p} \le \exp\Big(-\frac{4(p-1)}{p^2}\kappa_M\chi_M t\Big) \cdot \|u_0 - v_0\|_{L^p}.$$

If $v = 0$ then in the estimate above $\kappa_M$ can be replaced by 1.

Proof: Assume $1 < p < \infty$. (The cases $p = 1$ and $p = \infty$ follow by approximation.) Moreover,
assume $u_t - v_t \in L^2(M) \cap L^p(M)$. Then a slight modification of the proof of the previous
proposition yields

$$-\frac{1}{p}\partial_t\int_M |u_t - v_t|^p\,dm = -\int_M |u_t - v_t|^{p-1}\,\mathrm{sign}(u_t - v_t)\,\partial_t(u_t - v_t)\,dm$$
$$= \int_M D\big(|u_t - v_t|^{p-1}\,\mathrm{sign}(u_t - v_t)\big)(\nabla u_t - \nabla v_t)\,dm$$
$$= (p - 1)\int_M |u_t - v_t|^{p-2}\,D(u_t - v_t)(\nabla u_t - \nabla v_t)\,dm$$
$$\ge (p - 1)\kappa_M\int_M |u_t - v_t|^{p-2}\,D(u_t - v_t) \cdot \nabla(u_t - v_t)\,dm$$
$$= \frac{4}{p^2}(p - 1)\kappa_M\int_M D\big(|u_t - v_t|^{p/2}\,\mathrm{sign}(u_t - v_t)\big) \cdot \nabla\big(|u_t - v_t|^{p/2}\,\mathrm{sign}(u_t - v_t)\big)\,dm$$
$$\ge \frac{4}{p^2}(p - 1)\kappa_M\chi_M\int_M |u_t - v_t|^p\,dm.$$
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To be rigorous, one should assume in the previous argumentation that \u — v\ is bounded from 
above if p > 2 or bounded away from if p < 2, respectively. To overcome this restriction, one 
can approximate u and v by bounded solutions to the heat equation in the case p > 2. In the 
case p < 2, one can approximate \u — v\ by ((u — v) 2 + e 2 ) 1 / 2 . 

Obviously, the assumption v = allows to replace km in the first inequality above by 1. The 
claim follows again by an application of Gronwall's lemma. ■ 
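To get a feeling for how the decay rate in Theorem 3.7 depends on $p$, the factor $4(p-1)/p^2$ can be tabulated. The following Python snippet is a minimal sketch; the function name `lp_contraction_rate` is introduced here for illustration only and does not appear in the paper.

```python
def lp_contraction_rate(p: float) -> float:
    """Return the factor 4(p-1)/p^2 that multiplies kappa_M * chi_M * t
    in the exponent of the L^p contraction estimate (Theorem 3.7)."""
    if p < 1:
        raise ValueError("p must lie in [1, infinity]")
    if p == float("inf"):
        # Limit of 4(p-1)/p^2 as p tends to infinity.
        return 0.0
    return 4.0 * (p - 1.0) / (p * p)


if __name__ == "__main__":
    for p in (1.0, 4.0 / 3.0, 2.0, 4.0, 10.0, float("inf")):
        print(p, lp_contraction_rate(p))
```

The factor equals $4/(p\,p')$ with $p'$ the conjugate exponent, so it is maximal (equal to 1) at $p = 2$, where the estimate reduces to Corollary 3.6, and it degenerates to 0 at $p = 1$ and $p = \infty$, where only non-expansion remains.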

Now let us switch from contraction estimates to integrated Gaussian estimates for the heat 
semigroup. A preliminary step is the following: 

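Lemma 3.8 Let $u$ be a global solution to the heat equation $\partial_t u = \Delta u$ on $M$ and $\psi : M \to \mathbb{R}$ be a Lipschitz continuous function of bounded gradient $F(x, \nabla\psi(x)) \le C$ for all $x \in M$. Then we have, for all $0 \le s < t$,

\[ \|e^{\psi}u_t\|_{L^2} \le e^{C^2(t-s)}\,\|e^{\psi}u_s\|_{L^2}. \tag{3.11} \]

Proof: Straightforward calculations yield

\begin{align*}
\frac12\|e^{\psi}u_t\|_{L^2}^2 - \frac12\|e^{\psi}u_s\|_{L^2}^2
 &= \int_s^t\int_M e^{2\psi}u_r\,\partial_r u_r\,dm\,dr\\
 &= -\int_s^t\int_M D(e^{2\psi}u_r)(\nabla u_r)\,dm\,dr\\
 &= -\int_s^t\int_M \bigl\{e^{2\psi}Du_r(\nabla u_r) + 2u_r e^{2\psi}D\psi(\nabla u_r)\bigr\}\,dm\,dr\\
 &\le \int_s^t\int_M \bigl\{-F^2(\nabla u_r)e^{2\psi} + 2F(\nabla u_r)F^*(D\psi)\,u_r e^{2\psi}\bigr\}\,dm\,dr\\
 &\le \int_s^t\int_M F^2(\nabla\psi)\,u_r^2 e^{2\psi}\,dm\,dr \le C^2\int_s^t \|e^{\psi}u_r\|_{L^2}^2\,dr.
\end{align*}

Together with Gronwall's lemma, this implies the desired estimate. ■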



Theorem 3.9 (Integrated Gaussian Estimates à la Davies) For every $u, v \in L^2(M)$, we have

\[ \int_M u\,P_t v\,dm \le \exp\Bigl(-\frac{d^2(v,u)}{4t}\Bigr)\,\|u\|_{L^2}\,\|v\|_{L^2}, \tag{3.12} \]

where $d(v,u) = \operatorname{essinf}\{d(y,x) : x \in \mathrm{supp}[u],\ y \in \mathrm{supp}[v]\}$.

Proof: Given $u$ and $v$, apply Lemma 3.8 to the function $\psi(x) = C\,d(v,x)$, where $d(v,x) := \operatorname{essinf}\{d(y,x) : y \in \mathrm{supp}[v]\}$ and $C > 0$ is a constant to be fixed below. Then

\begin{align*}
\int_M u\,P_t v\,dm &\le \|e^{\psi}P_t v\|_{L^2}\,\|e^{-\psi}u\|_{L^2} \le e^{C^2 t}\,\|e^{\psi}v\|_{L^2}\,\|e^{-\psi}u\|_{L^2}\\
 &\le e^{C^2 t - C\,d(v,u)}\,\|v\|_{L^2}\,\|u\|_{L^2}.
\end{align*}

Choosing $C = d(v,u)/2t$ now yields the claim. ■
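The choice of $C$ in the last step is the minimizer of the exponent. As a short worked computation (with the abbreviation $d := d(v,u)$, introduced here only to shorten the formulas):

```latex
% Minimize the exponent C^2 t - C d over C > 0, where d := d(v,u).
\[
\frac{d}{dC}\bigl(C^2 t - C\,d\bigr) = 2Ct - d = 0
\iff C = \frac{d}{2t},
\qquad
\bigl(C^2 t - C\,d\bigr)\Big|_{C = d/2t}
  = \frac{d^2}{4t} - \frac{d^2}{2t}
  = -\frac{d^2}{4t}.
\]
```

This is exactly the Gaussian exponent appearing in (3.12).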






4 The Heat Equation — Local Solutions 

This chapter is devoted to studying the local regularity of solutions to the heat equation. Formulation of results and proofs follow classical lines. For the elliptic case, similar results have already been derived by Shen [Sh2] and by Belloni, Kawohl and Juutinen [BKJ]. See also [Di].

Throughout the chapter, the assumptions (1.2) and (2.1) will be in force.

Definition 4.1 Given an open subset $\Omega \subset M$ and an open interval $I \subset \mathbb{R}$, we say that a real function $u$ on $I \times \Omega$ is a local solution to the heat equation $\partial_t u = \Delta u$ on $I \times \Omega$ if $u \in L^2_{loc}(I \times \Omega)$ with $F^*(Du) \in L^2_{loc}(I \times \Omega)$ and for every smooth, compactly supported $v$ on $I \times \Omega$ (or, equivalently, for every $v \in H^1_c(I \times \Omega)$)

\[ \int_I\int_\Omega u_t\,\partial_t v_t\,dm\,dt = \int_I\int_\Omega Dv_t(\nabla u_t)\,dm\,dt. \tag{4.1} \]

Remark 4.2 A function $u$ being a local solution to the heat equation implies that $C_1 u + C_2$ is a local solution for every $C_1 \in \mathbb{R}_+$ and every $C_2 \in \mathbb{R}$. In particular, constants are local solutions to the heat equation. In general, it will not imply that $-u$ is a local solution.

Example 4.3 Let $\|\cdot\|$ be any smooth, strictly convex Minkowski norm on $\mathbb{R}^n$, put $F(x,\cdot) = \|\cdot\|$ for all $x$ and choose $m$ to be the Lebesgue measure. Then for each fixed $y \in \mathbb{R}^n$ the function

\[ u(t,x) = t^{-n/2}\exp\bigl(-\|y - x\|^2/4t\bigr) \tag{4.2} \]

is a local solution to the heat equation $\partial_t u = \Delta u$ on $\mathbb{R}_+ \times \mathbb{R}^n$. More generally, $u(t,x) = f(t, \|y - x\|)$ is a local solution to the heat equation for each smooth function $f : \mathbb{R}_+^2 \to \mathbb{R}$ satisfying $\partial_r f(t,r) \le 0$ and

\[ \partial_r^2 f(t,r) + \frac{n-1}{r}\,\partial_r f(t,r) = \partial_t f(t,r), \qquad \partial_r f(t,0) = 0. \tag{4.3} \]

If $f$ satisfies $\partial_r f(t,r) \ge 0$ and (4.3), then the function $v(t,x) = f(t, \|x - y\|)$ is a local solution to the heat equation. If $\|\cdot\|$ is even a norm (i.e., if in addition it is symmetric), then the latter holds true without any restriction on the sign of $\partial_r f(t,r)$.

Note that the function $u$ in (4.2) is $C^2$ in the space variable at $x = y$ if and only if $\|\cdot\|$ is a Hilbert norm.
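As a consistency check, one can verify by hand that the radial profile of (4.2) satisfies the radial equation. The computation below is a sketch that reads (4.3) as $\partial_r^2 f + \frac{n-1}{r}\partial_r f = \partial_t f$ with $\partial_r f(t,0) = 0$, and takes $f(t,r) = t^{-n/2}e^{-r^2/4t}$.

```latex
% Radial profile of (4.2): f(t,r) = t^{-n/2} exp(-r^2/(4t)).
\begin{align*}
\partial_r f &= -\frac{r}{2t}\,f \le 0,
  \qquad \partial_r f(t,0) = 0,\\
\partial_r^2 f &= \Bigl(\frac{r^2}{4t^2} - \frac{1}{2t}\Bigr) f,
  \qquad \frac{n-1}{r}\,\partial_r f = -\frac{n-1}{2t}\,f,\\
\partial_r^2 f + \frac{n-1}{r}\,\partial_r f
  &= \Bigl(\frac{r^2}{4t^2} - \frac{n}{2t}\Bigr) f
   = \partial_t f .
\end{align*}
```

In particular the sign condition $\partial_r f \le 0$ holds, so this profile falls under the first case of the example.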

Proposition 4.4 (Harnack Inequality) Every local solution to the heat equation $\partial_t u = \Delta u$ on $I \times \Omega$ is Hölder continuous (more precisely, it is almost everywhere equal to a Hölder continuous function).

Continuous local solutions satisfy the parabolic Harnack inequality and the strong maximum principle.

Proof: Since for given $u$ the Finsler Laplacian $\Delta u$ coincides with the weighted Laplacian $\Delta^{(u)}$ in the Riemannian metric derived from $Z = \nabla u$ (Lemma 2.4) and since for varying (and time-dependent) $u$ all these possible operators $\Delta^{(u)}$ are 'locally uniformly elliptic', the claim is an immediate consequence of Saloff-Coste's result [Sal] for locally uniformly elliptic operators on weighted Riemannian manifolds. ■



Proposition 4.5 The distributional time derivative $w = \partial_t u$ of any continuous local solution to the heat equation $\partial_t u = \Delta u$ on $I \times \Omega$ lies in $H^1_{loc}(M)$ and admits a Hölder continuous version (which satisfies the parabolic Harnack inequality and the strong maximum principle). It is a weak solution to the linear parabolic PDE

\[ \partial_t w = \mathrm{div}\bigl(g^{*(u)}Dw\bigr) \]

with the locally uniformly elliptic, time-dependent matrix $g^{*(u)}$, i.e. the matrix $g^*(Du)$ appearing in the proof below.

Proof: We postpone the technical proof for the fact $\partial_t u \in H^1_{loc}(M)$ to Appendix 8.2 and take this fact now for granted. Let $\Phi$ be a smooth, compactly supported test function on $I \times \Omega$. Applying (4.1) to $v = \partial_t\Phi$ yields

\begin{align*}
\int\!\!\int (\partial_t\Phi)\,w\,dm\,dt &= \int\!\!\int v\,\partial_t u\,dm\,dt\\
 &= -\int\!\!\int Dv\cdot J^*(Du)\,dm\,dt = \int\!\!\int D\Phi\cdot\partial_t[J^*(Du)]\,dm\,dt\\
 &= \int\!\!\int D\Phi\cdot g^*(Du)\cdot D(\partial_t u)\,dm\,dt = \int\!\!\int D\Phi\cdot g^{*(u)}\cdot Dw\,dm\,dt.
\end{align*}

Hence, $w$ is a weak solution to the linear PDE. Regularity theory for solutions to linear second order PDEs now implies that $w$ has a Hölder continuous version satisfying Harnack's inequality and the strong maximum principle. ■



In order to obtain higher order regularity results, we have to impose certain minimal smoothness assumptions on the data $F$ and $m$. We will assume that the maps $J^*(x,\alpha)$ and the logarithmic derivative $-V(x) = \log[m(dx)/dx]$ of the measure $m$ are Lipschitz continuous in $x$. More precisely,

we assume from now on that for each point $x \in M$ there exists a local coordinate system $(x^i)_{i=1}^n$ on a suitable neighborhood $U$ of $x$ and a number $\Lambda$ such that

\[ |J^*_{ki}(x,\alpha)| \le \Lambda F^*(x,\alpha), \qquad |\eta_k(x)| \le \Lambda \tag{4.4} \]

for almost all $x \in U$ and all $\alpha \in T^*_xM$. Here and henceforth $J^*_{ki}(x,\alpha) := \partial J^*_i/\partial x^k\,(x,\alpha)$ and $\eta_k(x) := \partial V/\partial x^k\,(x)$ where $m(dx) = e^{-V(x)}\,dx^1\cdots dx^n$.
The first important consequence of these assumptions is

Theorem 4.6 ($H^2$-Regularity) Assume that the transfer maps $J^*$ as well as the logarithmic density of the measure $m$ are differentiable in $x$ as specified in (4.4). Then every continuous local solution to the heat equation $\partial_t u = \Delta u$ on $I \times \Omega$ is $H^2_{loc}$ in $x$.

We postpone the technical proof to Appendix 8.3 and continue with the proof of Hölder continuity of the derivatives of $u$.

Lemma 4.7 For each local solution $u$ to the heat equation and each $k = 1,\dots,n$, the partial derivative $w(t,x) = D_k u(t,x) = \partial u/\partial x^k\,(t,x)$ is a weak solution to the equation

\[ \partial_t w = \mathrm{div}\bigl(g^{*(u)}\cdot Dw\bigr) + \mathrm{div}\,H + h \tag{4.5} \]

with a vector field $H \in L^2_{loc}(I \times \Omega)$ and a function $h \in L^\infty_{loc}(I \times \Omega)$ given by

\[ H_i(t,x) = J^*_{ki}(x, Du(t,x)) - \eta_k(x)\,J^*_i(x, Du(t,x)) \]

and

\[ h(t,x) = \eta_k(x)\,\partial_t u(t,x). \]

Proof: Let a smooth, compactly supported test function $\Phi$ on $I \times \Omega$ be given. Without restriction, we may assume that there exists a global coordinate system $(x^i)_{i=1}^n$ on $\Omega$ (or at least on the support of $\Phi$). In these coordinates, let $m$ be given as $m(dx) = e^{-V(x)}\,dx^1\cdots dx^n$. Applying (4.1) to $v = D_k\Phi$ yields

\begin{align*}
\int\!\!\int \partial_t(D_k\Phi)\,u\,dm\,dt &= \int\!\!\int D(D_k\Phi)\cdot J^*(Du)\,dm\,dt\\
 &= \int\!\!\int D\Phi\cdot\bigl[-D_k\bigl(J^*(Du)\bigr) + (D_kV)\,J^*(Du)\bigr]\,dm\,dt\\
 &= \int\!\!\int D\Phi\cdot\bigl[-g^*(Du)\cdot D(D_k u) - J^*_k(Du) + (D_kV)\,J^*(Du)\bigr]\,dm\,dt\\
 &= -\int\!\!\int D\Phi\cdot\bigl[g^*(Du)\cdot Dw + H\bigr]\,dm\,dt.
\end{align*}

On the other hand,

\begin{align*}
\int\!\!\int \partial_t(D_k\Phi)\,u\,dm\,dt &= \int\!\!\int \bigl[-(\partial_t\Phi)(D_k u) + (D_kV)(\partial_t\Phi)\,u\bigr]\,dm\,dt\\
 &= -\int\!\!\int \bigl[(\partial_t\Phi)\,w + \Phi h\bigr]\,dm\,dt.
\end{align*}

That is,

\[ \int\!\!\int D\Phi\cdot\bigl[g^*(Du)\cdot Dw + H\bigr]\,dm\,dt = \int\!\!\int \bigl[(\partial_t\Phi)\,w + \Phi h\bigr]\,dm\,dt \]

for all smooth, compactly supported $\Phi$ on $I \times \Omega$ and thus

\[ \partial_t w = \mathrm{div}\bigl(g^{*(u)}\cdot Dw + H\bigr) + h \]

locally in distributional sense on $I \times \Omega$. ■

Lemma 4.8 (i) If $w \in L^p_{loc}(I \times \Omega)$ is a weak solution to the equation (4.5) with a vector field $H \in L^p_{loc}(I \times \Omega)$ for some $p \in [1,\infty]$ and a function $h \in L^\infty_{loc}(I \times \Omega)$ then $w \in L^q_{loc}(I \times \Omega)$ for $q = pn/(n-2)$.

(ii) If $w$ is a weak solution to the equation (4.5) with a vector field $H \in L^p_{loc}(I \times \Omega)$ for some $p > n + 2$ and a function $h \in L^\infty_{loc}(I \times \Omega)$ then $w$ is Hölder continuous.

Proof: (i) This result should be well known (perhaps even in a sharper version). Since we could not find a reference, we include a sketch of the proof. We do not discuss smoothing and cut-off arguments. For simplicity, we assume that $I \times \Omega = (0,T) \times M$ and that $M$ is compact.
Let $w \in L^p(I \times \Omega)$ be a weak solution to the equation (4.5) with a vector field $H \in L^p_{loc}(I \times \Omega)$ and a function $h \in L^\infty_{loc}(I \times \Omega)$. Choose $w^{p-1}$ as a test function. Then the weak formulation of (4.5) implies

\begin{align*}
\frac1p\|w_T\|_{L^p}^p - \frac1p\|w_0\|_{L^p}^p
 &= \int_0^T \frac1p\,\partial_t\|w_t\|_{L^p}^p\,dt = \int_0^T\!\!\int_M w^{p-1}\,\partial_t w\,dm\,dt\\
 &= \int_0^T\!\!\int_M \bigl[-D(w^{p-1})\cdot g^{*(u)}\cdot Dw - D(w^{p-1})\cdot H + w^{p-1}h\bigr]\,dm\,dt\\
 &\le -\frac{p-1}{2}\int_0^T\!\!\int_M w^{p-2}\,Dw\cdot g^{*(u)}\cdot Dw\,dm\,dt\\
 &\quad + \frac{p-1}{2}\int_0^T\!\!\int_M w^{p-2}\,H\cdot g^{(u)}\cdot H\,dm\,dt + \int_0^T\!\!\int_M w^{p-1}h\,dm\,dt\\
 &\le -\frac{2(p-1)}{p^2}\,\bigl\|F^{*(u)}(Dw^{p/2})\bigr\|_{L^2}^2 + \frac{p-1}{2}\,\|w\|_{L^p}^{p-2}\,\bigl\|F^{(u)}(H)\bigr\|_{L^p}^2 + \|w\|_{L^{p-1}}^{p-1}\,\|h\|_{L^\infty}\\
 &\le C < \infty
\end{align*}

according to our assumptions on $w$, $H$ and $h$. From this estimate, we first of all deduce that $\|w_t\|_{L^p}$ is bounded in $t$ on $I$. Having this at hand, we secondly deduce that

\[ \bigl\|F^{*(u)}(Dw^{p/2})\bigr\|_{L^2} < \infty. \]

The classical Sobolev inequality now implies $w^{p/2} \in L^{2^*}$ with $2^* = 2n/(n-2)$. That is, $w \in L^q$ with $q = pn/(n-2)$.

(ii) This is a standard estimate. In the required version it can be found in [Sal]. However, similar versions certainly had been known much earlier, e.g., in the works of Moser, Aronson and Serrin. ■



Theorem 4.9 ($C^{1,\alpha}$-Regularity) Assume that the transfer maps $J^*$ as well as the logarithmic density of the measure $m$ are differentiable in $x$ as specified in (4.4). Then every continuous local solution to the heat equation $\partial_t u = \Delta u$ on $I \times \Omega$ is $C^{1,\alpha}$ in $t$ and $x$.

Proof: To deduce the Hölder continuity, we apply the first assertion of Lemma 4.8 to each of the partial derivatives $w = D_k u$ of the given solution $u$. It implies that $w \in L^p_{loc}$ for some $p > 2$ and thus in turn $H \in L^p_{loc}$ (according to our assumptions (4.4) on the coefficients of the Finsler structure). Finitely many iterations of this argument yield $w \in L^q_{loc}$ for $q$ sufficiently large in order to apply the second assertion of Lemma 4.8, which then implies Hölder continuity. ■
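The phrase "finitely many iterations" can be made concrete: each application of Lemma 4.8(i) multiplies the integrability exponent by $n/(n-2) > 1$, so the threshold $n + 2$ of Lemma 4.8(ii) is passed after finitely many steps. The following Python snippet is only an illustration of this exponent bookkeeping; the function name `bootstrap_exponents` and the default starting exponent `p0 = 2.0` are assumptions made here, and the sketch requires $n \ge 3$ so that the gain $n/(n-2)$ is defined.

```python
def bootstrap_exponents(n: int, p0: float = 2.0) -> list:
    """List the exponents p0, p0*n/(n-2), ... obtained by iterating
    Lemma 4.8(i), stopping as soon as the exponent exceeds n + 2
    (the threshold required by Lemma 4.8(ii))."""
    if n < 3:
        raise ValueError("the Sobolev gain n/(n-2) requires n >= 3")
    if p0 < 1:
        raise ValueError("the starting exponent must be at least 1")
    exponents = [p0]
    # The factor n/(n-2) is > 1, so the sequence grows geometrically
    # and the loop terminates.
    while exponents[-1] <= n + 2:
        exponents.append(exponents[-1] * n / (n - 2))
    return exponents


if __name__ == "__main__":
    for dim in (3, 4, 10):
        print(dim, bootstrap_exponents(dim))
```

The number of steps grows with the dimension, since the gain $n/(n-2)$ tends to 1 while the threshold $n + 2$ increases.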



Remark 4.10 If $F$ is a smooth Finsler structure and if the logarithmic density of the measure $m$ is $C^\infty$, then local solutions $u$ of the heat equation $\partial_t u = \Delta u$ are $C^\infty$ in $t$ and $x$ outside the set $\{(t,x) : Du(t,x) = 0\}$. On this set, however, the solutions typically will not be $C^2$. See Example 4.3.

5 Ricci Curvature and Heat Equation 

From now on, we always assume that $M$ is compact and that $F$ is smooth on $TM \setminus \{0\}$. This in particular implies that the uniform ellipticity condition formulated in Chapter 1 is equivalent to the strong convexity of $F(x,\cdot)$ at every $x \in M$ (in the sense that the matrix $g_{ij}(x,\xi)$ in (1.1) is positive-definite for all $\xi \in T_xM \setminus \{0\}$).

We review some geometric concepts in a heuristic way, intended for nonspecialists. For further reading and more details, we refer to [BCS] and [Sh3].

A $C^1$-curve $\gamma : [0,1] \to M$ is called a geodesic if it has constant speed (i.e., $F(\gamma,\dot\gamma)$ is constant) and if it is locally minimizing, i.e., given $t \in [0,1]$, there is $\varepsilon > 0$ such that $d(\gamma(s),\gamma(s')) = \int_s^{s'} F(\gamma,\dot\gamma)\,dr$ holds for all $s, s' \in [0,1] \cap [t-\varepsilon, t+\varepsilon]$ with $s < s'$. Such $\gamma$ is in fact $C^\infty$ and, for any $x \in M$ and $y \in M$ sufficiently close to $x$, there is a unique minimal geodesic $\gamma : [0,1] \to M$ with $\gamma(0) = x$ and $\gamma(1) = y$ (i.e., $d(x,y) = \int_0^1 F(\gamma,\dot\gamma)\,dr$).

Given $x \in M$ and $\xi \in T_xM$, we define the exponential map by $\exp_x\xi := \gamma(1)$ provided there exists a geodesic $\gamma : [0,1] \to M$ with $\gamma(0) = x$ and $\dot\gamma(0) = \xi$. By the Hopf-Rinow theorem (cf. [BCS, Theorem 6.6.1]), $(M,F)$ is forward complete if and only if $\exp_x$ is defined on all of $T_xM$ for each (or some) $x \in M$. In this case, any two points $x, y \in M$ can be connected by a minimal geodesic from $x$ to $y$.

For a unit vector $v \in T_xM$, let $r(v) \in (0,\infty]$ be the supremum of $r > 0$ such that the geodesic $t \mapsto \exp_x tv$ is minimal on $[0,r]$. If $r(v) < \infty$, then $\exp_x(r(v)v)$ is called a cut point of $x$, and the cut locus $\mathrm{Cut}(x)$ of $x$ is defined as the set of all cut points of $x$. The exponential map $\exp_x$ is a $C^\infty$-diffeomorphism from $\{tv : v \in T_xM,\ F(x,v) = 1,\ t \in (0,r(v))\}$ to $M \setminus (\mathrm{Cut}(x) \cup \{x\})$.

Fix a unit vector $\xi \in T_xM$ (i.e., $F(x,\xi) = 1$) and let $Z$ be an arbitrary $C^\infty$-vector field on an open neighborhood $U$ of $x$ with $Z(x) = \xi$ and such that every integral curve of $Z$ is a geodesic. A typical example is $Z = \nabla[d(\gamma(-\varepsilon),\cdot)]$ for sufficiently small $\varepsilon > 0$, where $\gamma : [-\varepsilon,\varepsilon] \to M$ is a geodesic with $\dot\gamma(0) = \xi$. Then $Z$ induces the Riemannian structure $g_Z(x) := g(x,Z(x))$ on $U$ through (1.1) (see also (1.9)), and the flag curvature $\mathcal{K}(\xi,\eta)$ of $\xi$ and a linearly independent unit vector $\eta \in T_xM$ is defined as the sectional curvature of the plane spanned by $\xi$ and $\eta$ with respect to $g_Z$ (see [Sh3, Proposition 6.2.2]). Similarly, the Ricci curvature $\mathrm{Ric}(\xi)$ is the Ricci curvature of $\xi$ with respect to $g_Z$.

Recall our arbitrarily fixed measure $m$ on $M$ and its representation $m(dx) = e^{-V(x)}\,dx^1\cdots dx^n$ (see (2.1)). Similarly, the Riemannian volume element $m_Z$ induced from $g_Z$ has a representation as

\[ m_Z(dx) = e^{-W_Z(x)}\,dx^1\cdots dx^n \]

for some function $W_Z$ on $U$. Thus we can represent $m(dx) = e^{-V_Z(x)}\,m_Z(dx)$ with $m_Z$ as a reference measure and $V_Z = V - W_Z$ as a weight function. We put

\[ \partial_\xi V_Z := \frac{d}{dt}\Big|_{t=0} V_Z(\gamma(t)), \qquad \partial_\xi^2 V_Z := \frac{d^2}{dt^2}\Big|_{t=0} V_Z(\gamma(t)), \tag{5.1} \]

where $\gamma : [-\varepsilon,\varepsilon] \to M$ is the geodesic with $\dot\gamma(0) = \xi$. The important observation now is that for given $\xi$ the quantities $\mathrm{Ric}(\xi) := \mathrm{Ric}_{g_Z}(Z,Z)$ as well as $\partial_\xi V := \partial_\xi V_Z$ and $\partial_\xi^2 V := \partial_\xi^2 V_Z$ do not depend on the choice of the vector field $Z$ (provided it has geodesics as integral curves).
The following lower Ricci curvature bound was introduced in [Oh4], inspired by the theory of weighted Riemannian manifolds.

Definition 5.1 Let $(M,F,m)$ be a smooth, $n$-dimensional Finsler manifold endowed with a smooth measure $m$ and let $K \in \mathbb{R}$.

(i) We say that $(M,F,m)$ satisfies the bound $n$-Ric $\ge K$ if $\mathrm{Ric}(\xi) \ge K$ and $\partial_\xi V = 0$ for any unit vector $\xi \in T_xM$.

(ii) We say that $(M,F,m)$ satisfies the bound $N$-Ric $\ge K$ for some given number $N \in (n,\infty)$ if

\[ \mathrm{Ric}_N(\xi) := \mathrm{Ric}(\xi) + \partial_\xi^2 V - \frac{(\partial_\xi V)^2}{N - n} \ge K \]

for any unit vector $\xi \in T_xM$.

(iii) We say that $(M,F,m)$ satisfies the bound $\infty$-Ric $\ge K$ if $\mathrm{Ric}_\infty(\xi) := \mathrm{Ric}(\xi) + \partial_\xi^2 V \ge K$ for any unit vector $\xi \in T_xM$.

The infinite dimensional case (iii) corresponds to the Bakry-Émery tensor ([BE]) and the finite dimensional case (ii) is an analogue of Qian's generalized one ([Qi], see also [Lo]). The most restricted case (i) still admits a number of non-Riemannian spaces. For instance, the Busemann-Hausdorff measure on a Finsler manifold of Berwald type satisfies $\partial V = 0$ ([Sh1, Propositions 2.6, 2.7]). However, the existence of a measure satisfying $\partial V = 0$ should be a strong constraint among general Finsler manifolds, and then there is no advantage in dealing with concrete measures. This is the reason why we consider an arbitrary measure $m$ on $M$.




Theorem 5.2 Assume that $N$-Ric $\ge K$ for some pair $K, N \in \mathbb{R}$ with $N > \dim M$. Then the Laplacian of the distance function $u(x) = d(z,x)$ from any given point $z \in M$ can be estimated as follows

\[ \Delta u(x) \le \sqrt{-(N-1)K}\cdot\coth\Bigl(\sqrt{\frac{-K}{N-1}}\,d(z,x)\Bigr) \tag{5.2} \]

pointwise on $M_z := M \setminus (\{z\} \cup \mathrm{Cut}(z))$ and in the sense of distributions on $M \setminus \{z\}$. If $K = 0$, then the RHS should be interpreted as $(N-1)/d(z,x)$; if $K > 0$, then as $\sqrt{(N-1)K}\cdot\cot\bigl(\sqrt{K/(N-1)}\,d(z,x)\bigr)$.

Proof: Let us fix $z \in M$ and put $u(x) = d(z,x)$. Then on $M_z$ the vector field $Z(x) := \nabla u(x)$ is well-defined, smooth and satisfies $F(x,Z(x)) = 1$. Let $d_Z$ and $\Delta_Z$ denote the Riemannian distance and the weighted Laplacian on $M_z$ with the Riemannian metric $g_Z(x) := g(x,Z(x))$. Then $u(x) = d_Z(z,x)$ and $\Delta u(x) = \Delta_Z u(x)$ by Lemma 2.4. Hence, estimating the Finsler Laplacian of the Finsler distance amounts to estimating the weighted Riemannian Laplacian of the Riemannian distance function.

Due to our curvature assumption on the Finsler space $(M,F,m)$, the weighted Riemannian space $(M_z, g_Z, m)$ satisfies the curvature bound $N$-Ric $\ge K$ in the sense of Definition 5.1. On weighted Riemannian spaces, the latter is known to be equivalent to a generalized Bochner inequality or $\Gamma_2$-inequality in the sense of Bakry-Émery

\[ \Gamma_2(v,v) \ge \frac1N(\Delta_Z v)^2 + K\cdot\Gamma(v,v) \tag{5.3} \]

for all smooth functions $v$ on $M_z$. Here $\Gamma(v,w) = Dv(\nabla_Z w)$ and

\[ \Gamma_2(v,v) = \frac12\Delta_Z\Gamma(v,v) - \Gamma(\Delta_Z v, v), \]

see [BE], [Qi], [Lo]. The remarkable observation of Bakry and Qian [BQ] is the 'self-improving property' of (5.3) saying that the validity of the previous estimate (for all smooth $v$) entails the stronger estimate

\[ \Gamma_2(v,v) \ge \frac1N(\Delta_Z v)^2 + K\cdot\Gamma(v,v) + \frac{N}{N-1}\Bigl[\frac{\Delta_Z v}{N} - \frac{\Gamma(v,\Gamma(v,v))}{2\Gamma(v,v)}\Bigr]^2 \]

valid for all smooth functions $v$ with nonvanishing gradient. Applying the latter to $u(x) = d(z,x)$ and using the fact that $\Gamma(u,u) = 1$ yields

\[ \Gamma_2(u,u) \ge \frac{1}{N-1}(\Delta_Z u)^2 + K \tag{5.4} \]

on $M_z$, where $\Gamma_2(u,u) = -D(\Delta_Z u)(\nabla_Z u) = -D(\Delta_Z u)(Z)$.

Now let $\gamma : [0,l) \to M$ be any minimizing, unit speed geodesic in $(M,F)$ emanating from $z$. Then $d(z,\gamma_t) = t$ and $\dot\gamma_t = Z(\gamma_t)$. Put $\phi_t = \Delta u(\gamma_t)$ for $t \in (0,l)$. Then (5.4) together with Lemma 2.4 states

\[ \dot\phi_t \le -\frac{1}{N-1}\,\phi_t^2 - K \]

on $(0,l)$. Comparison results for ODEs then imply

\[ \phi_t \le \sqrt{(N-1)K}\cdot\cot\Bigl(\sqrt{\frac{K}{N-1}}\,(t + t_0)\Bigr) \]

for some $t_0 \le 0$ (and the usual interpretation of the RHS if $K \le 0$). Local asymptotics for small $t$ imply $t_0 = 0$. This proves the claim on the pointwise estimate of the Laplacian on $M_z$.

The extension to a distributional inequality, valid also on the cut locus, follows by the well-known Calabi argument. ■


Corollary 5.3 Assume that $N$-Ric $\ge K$ and let $u(x) = f(d(z,x))$ for some nondecreasing smooth function $f : (0,\infty) \to \mathbb{R}$. Then on $M_z$,

\[ \Delta u(x) \le f''(d(z,x)) + f'(d(z,x))\,\sqrt{(N-1)K}\cdot\cot\Bigl(\sqrt{\frac{K}{N-1}}\,d(z,x)\Bigr) \tag{5.5} \]

(if $K > 0$, with the appropriate modification on the right-hand side for $K \le 0$). Similarly, if $v(x) = h(d(x,z))$ for some nonincreasing smooth function $h : (0,\infty) \to \mathbb{R}$, then on $M_z$,

\[ \Delta v(x) \ge h''(d(x,z)) + h'(d(x,z))\,\sqrt{(N-1)K}\cdot\cot\Bigl(\sqrt{\frac{K}{N-1}}\,d(x,z)\Bigr). \tag{5.6} \]

In both cases, the estimates extend to hold in the sense of distributions on all of $M \setminus \{z\}$.

If the function $f$ has a smooth extension to $[0,\infty)$ with $f'(0) = 0$, then the inequality (5.5) holds on all of $M$ in the sense of distributions. Analogously for (5.6) provided $h'(0) = 0$.

Proof: The first claim follows from Theorem 5.2 by simple application of the chain rule:

\[ \Delta f(u) = \mathrm{div}(\nabla f(u)) = \mathrm{div}(f'(u)\nabla u) = f'(u)\Delta u + f''(u)\,Du(\nabla u) \]

and the fact that $Du(\nabla u) = 1$.

For the second claim, a similar argumentation with $v(x) = d(x,z)$ yields

\begin{align*}
\Delta h(v) &= \mathrm{div}(\nabla h(v)) = \mathrm{div}\bigl(-h'(v)\nabla(-v)\bigr)\\
 &= h'(v)\bigl(-\Delta(-v)\bigr) + h''(v)\cdot D(-v)(\nabla(-v)).
\end{align*}

Observing that $v(x) = \overleftarrow{d}(z,x)$, $D(-v)(\nabla(-v)) = Dv(\overleftarrow{\nabla}v) = 1$ and $(-\Delta(-v)) = \overleftarrow{\Delta}v$, the claim follows as before since the bound $N$-Ric $\ge K$ for $(M,F,m)$ implies the same bound for the Finsler space with reverse structure $(M,\overleftarrow{F},m)$.

It remains to prove that (5.5) holds at the origin in the sense of distributions provided $f'(0) = 0$. Without restriction, we may assume $f(r) = r^2$. (Otherwise, choose smooth $g$ with $f(r) = g(r^2)$ and use the chain rule.) Obviously, for $u(x) = d^2(z,x)$, the distribution $\Delta u$ assigns no mass to the origin. (Choose $\psi_\varepsilon(x) = (1 - \varepsilon^{-2}d^2(z,x))_+$ as test function.) ■



Corollary 5.4 Assume that $N$-Ric $\ge K$ and let $h = h(t,r)$ be a smooth solution to the PDE

\[ \partial_t h = \partial_r^2 h + \partial_r h\,\sqrt{(N-1)K}\cdot\cot\Bigl(\sqrt{\frac{K}{N-1}}\,r\Bigr) \tag{5.7} \]

on $(0,\infty) \times (0,L)$ (if $K > 0$, with the appropriate modification on the right-hand side for $K \le 0$), where $L = \pi\sqrt{(N-1)/K}$ if $K > 0$ and $L = \infty$ else. Assume in addition $\partial_r h \le 0$ on $(0,\infty) \times (0,L)$ and $\partial_r h = 0$ on $(0,\infty) \times \{0\}$. Then for any $z \in M$ the function $u(t,x) = h(t,d(x,z))$ is a subsolution to the heat equation on $M$. That is, $\partial_t u \le \Delta u$ in the sense of distributions on $(0,\infty) \times M \setminus \{z\}$.

Example 5.5 (i) Assume that $N$-Ric $\ge 0$. Then for any $z \in M$ the function

\[ u(t,x) = t^{-N/2}\exp\Bigl(-\frac{d^2(x,z)}{4t}\Bigr) \]

is a subsolution to the heat equation on $M$.
(ii) Assume that 3-Ric $\ge -2$. Then for any $z \in M$ the function

\[ u(t,x) = t^{-3/2}\cdot\frac{d(x,z)}{\sinh(d(x,z))}\cdot\exp\Bigl(-t - \frac{d^2(x,z)}{4t}\Bigr) \]

is a subsolution to the heat equation on $M$.

Theorem 5.6 (Cheeger-Yau Estimate) Assume $N$-Ric $\ge K$ for some pair $K, N \in \mathbb{R}$ with $N > \dim M$ and let $u$ be a solution to the heat equation on $[0,\infty) \times M$ with $u(0,\cdot) \ge h_0(d(\cdot,z))$ for some $z \in M$ and some smooth decreasing function $h_0$ on $[0,L)$. Then

\[ u(t,x) \ge h^{K,N}(t,d(x,z)) \tag{5.8} \]

for all $t > 0$ and $x \in M$ where $h^{K,N}$ denotes the solution to the PDE (5.7) with initial condition $h^{K,N}(0,\cdot) = h_0$ and Neumann boundary condition $\partial_r h^{K,N}(\cdot,0) = 0$.

Proof: We first observe that $\partial_r h_0 \le 0$ implies $\partial_r h^{K,N}(t,\cdot) \le 0$ for all $t > 0$. Then the claim follows from the parabolic maximum principle along with Corollary 5.4. ■
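The radial profile behind Example 5.5(ii) can be sanity-checked numerically. For $N = 3$ and $K = -2$ the coefficient in (5.7) becomes $\sqrt{-(N-1)K}\,\coth\bigl(\sqrt{-K/(N-1)}\,r\bigr) = 2\coth r$, so the profile $h(t,r) = t^{-3/2}\,\frac{r}{\sinh r}\,e^{-t - r^2/4t}$ should satisfy $\partial_t h = \partial_r^2 h + 2\coth(r)\,\partial_r h$. The following Python snippet is a minimal finite-difference sketch of that check; the function names `h` and `pde_residual`, the step size and the sample points are choices made here, not part of the paper.

```python
import math


def h(t: float, r: float) -> float:
    """Radial profile of Example 5.5(ii):
    t^(-3/2) * r/sinh(r) * exp(-t - r^2/(4t)), for t > 0 and r > 0."""
    return t ** -1.5 * (r / math.sinh(r)) * math.exp(-t - r * r / (4.0 * t))


def pde_residual(t: float, r: float, eps: float = 1e-4) -> float:
    """Central-difference approximation of
    d_t h - (d_r^2 h + 2*coth(r)*d_r h), which should be close to zero."""
    dt = (h(t + eps, r) - h(t - eps, r)) / (2.0 * eps)
    dr = (h(t, r + eps) - h(t, r - eps)) / (2.0 * eps)
    drr = (h(t, r + eps) - 2.0 * h(t, r) + h(t, r - eps)) / (eps * eps)
    coth = math.cosh(r) / math.sinh(r)
    return dt - (drr + 2.0 * coth * dr)


if __name__ == "__main__":
    for (t, r) in ((1.0, 1.0), (0.5, 2.0), (2.0, 0.7)):
        print(t, r, pde_residual(t, r))
```

The profile is also decreasing in $r$, as required by the sign condition $\partial_r h \le 0$ in Corollary 5.4; the residual is only zero up to discretization and rounding error.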

Next, we are going to apply the above estimate to the 'fundamental solution' for the heat equation on $M$. What we have in mind is to study $p_t(x,z) = P_t\delta_z(x)$, the solution to the heat equation with initial data $\delta_z$. Unfortunately, $P_t\delta_z$ is not defined since our heat semigroup only acts on $L^2(M)$ (or on $\bigcup_{1 \le p \le \infty} L^p(M)$, see Theorem 3.7), but until now not on measures. We thus will define $p_t(x,z)$ via approximation of the initial data $\delta_z$.
For this purpose, let

\[ \rho(z) = \lim_{r \to 0}\frac{m(B^-(z,r))}{c_n r^n} \]

with $c_n := \pi^{n/2}/\Gamma(n/2+1)$ being the volume of the $n$-dimensional Euclidean unit ball. Recall that $B^-(z,r) = \{x \in M : d(x,z) < r\}$ denotes the backward open ball in $M$. Given $K \in \mathbb{R}$ and $n \in \mathbb{N}$ let $p_t^{K,n}(r)$ denote the unique solution of the above PDE (5.7) with $p_t^{K,n}(r)\,dr \to \delta_0(dr)$ weakly as $t \to 0$. Recall that for each fixed $\zeta$ in the model space $\mathbb{M}^{K,n}$ of dimension $n$ and constant sectional curvature $K/(n-1)$ the function $(t,\xi) \mapsto p_t^{K,n}(d(\xi,\zeta))$ is a solution of the heat equation on $\mathbb{M}^{K,n}$.

Theorem 5.7 Assume that the Finsler space $(M,F,m)$ is compact and satisfies $n$-Ric $\ge K$ for some $K \in \mathbb{R}$ (with $n$ being the dimension of $M$).

(i) For all $t > 0$ and all $x, z \in M$

\[ p_t(x,z) := \frac{1}{\rho(z)}\lim_{\varepsilon \to 0} P_{t-\varepsilon}u_\varepsilon(x) \]

exists as a monotone limit with $u_\varepsilon(x) := p_\varepsilon^{K,n}(d(x,z))$.

(ii) For each $z \in M$ the function $(t,x) \mapsto p_t(x,z)$ is a solution to the heat equation on $(0,\infty) \times M$ with $p_t(x,z)\,m(dx) \to \delta_z(dx)$ weakly in the sense of measures as $t \to 0$.

(iii) For all $t > 0$ and all $x, z \in M$

\[ p_t(x,z) \ge \frac{1}{\rho(z)}\,p_t^{K,n}(d(x,z)). \]

Proof: Throughout the proof we fix $K$ and $z \in M$. (i) According to the previous theorem

\[ P_{t-s}u_s(x) \ge p_t^{K,n}(d(x,z)) \tag{5.9} \]

for all $0 < s < t$ and all $x \in M$. Hence, for all $0 < r < s < t$

\[ P_{t-r}u_r(x) = P_{t-s}(P_{s-r}u_r)(x) \ge P_{t-s}\bigl(p_s^{K,n}(d(\cdot,z))\bigr)(x) = P_{t-s}u_s(x). \]

This proves the monotonicity and thus the existence of the limit.
(iii) follows immediately from (5.9) as $s \to 0$.

(ii) Given $s > 0$, for each $\varepsilon \in (0,s)$ the function $v_\varepsilon(t,x) := P_{t-\varepsilon}u_\varepsilon(x)$ is a nonnegative solution to the heat equation on $[s,\infty) \times M$. Hence, in particular it satisfies the parabolic Harnack inequality and, with $|\partial B^-(z,r)| := (d/dr)\,m(B^-(z,r))$,

\[ \int_M v_\varepsilon(t,x)\,m(dx) = \int_M u_\varepsilon(x)\,m(dx) = \int_0^\infty p_\varepsilon^{K,n}(r)\cdot|\partial B^-(z,r)|\,dr \to \rho(z) \]

uniformly in $\varepsilon \in (0,s)$ as $s \to 0$. Thus the monotone convergence of $v_\varepsilon(t,x)$ together with the compactness of $M$ imply uniform convergence in $x$ as well as $L^2$-convergence (for each fixed $t \ge 2s$) as $\varepsilon \to 0$. Together with the $L^2$-contraction property of the heat semigroup this then yields that the limit is again a solution to the heat equation on $(2s,\infty) \times M$.
The proof of the weak convergence follows easily from property (iii). Indeed, for each continuous function $f$ on $M$, bounded in modulus by $C$, we obtain

\begin{align*}
\int_M f(x)\,p_t(x,z)\,m(dx) &= -C + \int_M (f(x) + C)\,p_t(x,z)\,m(dx)\\
 &\ge -C + \int_M (f(x) + C)\,\frac{1}{\rho(z)}\,p_t^{K,n}(d(x,z))\,m(dx) \to f(z)
\end{align*}

as $t \to 0$. Similarly, we deduce $\limsup_{t \to 0}\int_M f(x)\,p_t(x,z)\,m(dx) \le f(z)$, which then proves the claim. ■



6 The Finsler Structure of the Wasserstein Space 

In this chapter, we introduce the Finsler structure of the Wasserstein space over a smooth, compact Finsler manifold. This concept goes back to Otto's pioneering work for Euclidean spaces ([Ot]). Our discussion follows ([Vi1] and) [AGS, §8] for Hilbert spaces and [Vi2] for Riemannian manifolds as well.

We denote by $\mathcal{P}(M)$ the set of all Borel probability measures on $M$, and $\mathcal{P}_{ac}(M) \subset \mathcal{P}(M)$ stands for the subset consisting of absolutely continuous measures with respect to $m$. Given $\mu, \nu \in \mathcal{P}(M)$, we say that $\pi \in \mathcal{P}(M \times M)$ is a coupling of $(\mu,\nu)$ if its marginals are $\mu$ and $\nu$.

Definition 6.1 For $\mu, \nu \in \mathcal{P}(M)$, we define the $L^2$-Wasserstein distance $d_W(\mu,\nu)$ by

\[ d_W(\mu,\nu) := \inf_\pi\Bigl(\int_{M \times M} d^2(x,y)\,d\pi(x,y)\Bigr)^{1/2}, \]

where the infimum is taken over all couplings $\pi \in \mathcal{P}(M \times M)$ of $(\mu,\nu)$. A coupling $\pi$ of $(\mu,\nu)$ is said to be optimal if it attains the infimum above.

Given nonnegative functions $\rho, \sigma \in L^2(M)$ with $\mu := \rho m$, $\nu := \sigma m \in \mathcal{P}_{ac}(M)$, consider the coupling $\pi$ of $(\mu,\nu)$ given by $\pi = \mathrm{diag}_\sharp(\min\{\mu,\nu\}) + (\mu - \nu)_+ \times (\nu - \mu)_+$, where $\mathrm{diag}(x) := (x,x)$ for $x \in M$ and $(\mu - \nu)_+(A) = \max\{\mu(A) - \nu(A), 0\}$ for each Borel set $A \subset M$. Then we have

\begin{align*}
d_W(\mu,\nu) &\le \mathrm{diam}(M)\,\bigl\{[(\mu - \nu)_+ \times (\nu - \mu)_+](M \times M)\bigr\}^{1/2}\\
 &= \frac{\mathrm{diam}(M)}{2}\,\|\rho - \sigma\|_{L^1} \le \frac{\mathrm{diam}(M)\,m(M)^{1/2}}{2}\,\|\rho - \sigma\|_{L^2}.
\end{align*}
Hence, if a curve in $L^2(M) \cap \mathcal{P}(M)$ is (locally) Lipschitz continuous as a curve in $L^2(M)$, then it is (locally) Lipschitz continuous also as a curve in $\mathcal{P}(M)$. In particular, the heat flow constructed in Theorem 3.4 starting from $u_0 \in H^1(M)$ with $u_0 m \in \mathcal{P}(M)$ is locally Lipschitz continuous on $(0,\infty)$ as a curve in $\mathcal{P}(M)$.

A function $\varphi : M \to \mathbb{R}$ is said to be $d^2/2$-concave if there is a function $\psi : M \to \mathbb{R}$ such that

\[ \varphi(x) = \psi^c(x) := \inf_{y \in M}\{d^2(x,y)/2 - \psi(y)\} \]

holds for all $x \in M$. Here $\psi^c$ is called the $c$-transform of $\psi$. We similarly define the $c$-transform $\varphi^c$ of $\varphi$ by $\varphi^c(y) := \inf_{x \in M}\{d^2(x,y)/2 - \varphi(x)\}$. Then $\varphi \le (\varphi^c)^c$ is always true and $\varphi$ is $d^2/2$-concave if and only if $\varphi = (\varphi^c)^c$. Moreover, any $d^2/2$-concave function is Lipschitz continuous and twice differentiable a.e. (see [Oh3]).

We say that $\varphi$ is $d^2/2$-convex if $-\varphi$ is $d^2/2$-concave. Then the Brenier-McCann characterization of optimal transport states the following (see [Oh4]):

Theorem 6.2 For any $\mu \in \mathcal{P}_{ac}(M)$ and any $\nu \in \mathcal{P}(M)$, there exists a unique $d^2/2$-convex function $\varphi : M \to \mathbb{R}$ (up to an additive constant) such that the map $T(x) := \exp_x(\nabla\varphi(x))$ is a unique optimal transport from $\mu$ to $\nu$ in the sense that $\pi := (\mathrm{Id}_M \times T)_\sharp\mu$ is a unique optimal coupling of $(\mu,\nu)$. Furthermore, the curve $(\mu_t)_{t \in [0,1]}$ given by $\mu_t = (T_t)_\sharp\mu$ with $T_t(x) = \exp_x(t\nabla\varphi(x))$ is a unique minimal geodesic from $\mu$ to $\nu$.

The next lemma is an analogue of the Riemannian one in [Vi2].

Lemma 6.3 There exists a positive constant $\varepsilon > 0$ depending on $M$ such that, if a $C^2$-function $\varphi$ on $M$ satisfies

\[ \sup_M|\varphi| \le \varepsilon, \qquad \sup_M F(\nabla(-\varphi)) \le \varepsilon, \qquad \frac{d^2}{dt^2}\Big|_{t=0}[\varphi\circ\gamma(t)] \le \varepsilon \tag{6.1} \]

along every unit speed geodesic $\gamma$, then $\varphi$ is $d^2/2$-concave.

Proof: Thanks to the compactness of $M$, there are constants $c, \delta > 0$ such that

\[ \frac{d^2}{dt^2}\Big|_{t=0}\Bigl[\frac12 d^2(\gamma(t),y)\Bigr] \ge c \]

holds for any $y \in M$ and unit speed geodesic $\gamma$ with $d(\gamma(0),y) < \delta$ (see [Sh3, Remark 15.1.4] or [Oh3]). (To be precise, the above inequality holds in the weak sense if $\gamma(0) = y$.) In particular, the backward open ball $B^-(y,\delta)$ is convex for any $y \in M$. It costs no generality to assume $\delta < 4$. We put $\varepsilon := \min\{\delta^2/4, c/2\}$ and suppose that a $C^2$-function $\varphi$ satisfies the condition (6.1) for this $\varepsilon$.

For each $y \in M$, consider the function $f_y(x) := d^2(x,y)/2 - \varphi(x)$. By construction, $f_y$ is strictly convex ($(d^2/dt^2)|_{t=0}(f_y\circ\gamma) \ge c/2$ along any unit speed geodesic $\gamma$ with $d(\gamma(0),y) < \delta$). Given $x \notin B^-(y,\delta)$, we observe $f_y(x) \ge \delta^2/4 \ge -\varphi(y) = f_y(y)$. Hence $f_y$ attains its minimum at a unique point in $B^-(y,\delta)$.

Fix arbitrary $x \in M$ and put $y = \exp_x(\nabla(-\varphi)(x))$. Note that $d(x,y) \le \varepsilon < \delta$ by assumption. Then we have $D(-d^2(\cdot,y)/2)(x) = D(-\varphi)(x)$ and hence $Df_y(x) = 0$. This implies that $x$ is the unique minimizing point of $f_y$, so that $\varphi^c(y) = f_y(x) = d^2(x,y)/2 - \varphi(x)$. Therefore we find $(\varphi^c)^c(x) \le d^2(x,y)/2 - \varphi^c(y) = \varphi(x)$. As the reverse inequality is always true, we obtain $(\varphi^c)^c(x) = \varphi(x)$ for all $x \in M$, which shows that $\varphi$ is $d^2/2$-concave. ■

In particular, for fixed $\mu \in \mathcal{P}_{ac}(M)$ and any $C^2$-function $\varphi$, the map $T(x) := \exp_x(\nabla(-c\varphi)(x))$ is the unique optimal transport from $\mu$ to $T_\sharp\mu$ provided $c > 0$ is sufficiently small. Thus we arrive at the following notion of tangent and cotangent spaces.

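Definition 6.4 For each $\mu \in \mathcal{P}(M)$, we define

\[ T_\mu\mathcal{P} := \overline{\{\Phi = \nabla\varphi : \varphi \in C^\infty(M)\}}^{F_W(\mu,\cdot)}, \qquad T^*_\mu\mathcal{P} := \overline{\{\alpha = D\varphi : \varphi \in C^\infty(M)\}}^{F^*_W(\mu,\cdot)}, \]

where the closures are taken with respect to the Finsler structures (Minkowski norms) depending on $\mu$:

\[ F_W(\mu,\Phi) := \Bigl(\int_M F(x,\Phi(x))^2\,d\mu(x)\Bigr)^{1/2}, \qquad F^*_W(\mu,\alpha) := \Bigl(\int_M F^*(x,\alpha(x))^2\,d\mu(x)\Bigr)^{1/2}. \]

Note that here the completion may be equally understood as forward completion or backward completion. Indeed, since by assumption (1.2) (or (1.3)) the norms $F(x,\cdot)$ and $\overleftarrow{F}(x,\cdot)$ are locally equivalent and we are now in a compact setting, convergence of $\Phi_n$ to $\Phi$ in the sense of $F_W(\mu,\Phi_n - \Phi) \to 0$ is equivalent to convergence in the sense of $F_W(\mu,\Phi - \Phi_n) \to 0$. Similarly, elements of $T_\mu\mathcal{P}$ consist of equivalence classes of vector fields $\Phi_1, \Phi_2$ with $F_W(\mu,\Phi_1 - \Phi_2) = 0$ or equivalently with $F_W(\mu,\Phi_2 - \Phi_1) = 0$.

Let us remark that $F_W$ and $F^*_W$ are dual to each other if we define a pairing between $T^*_\mu\mathcal{P}$ and $T_\mu\mathcal{P}$ by

\[ \langle\alpha,\Phi\rangle_\mu := \int_M \langle\alpha(x),\Phi(x)\rangle_x\,d\mu(x), \]

where $\langle\cdot,\cdot\rangle_x$ denotes the natural pairing between $T^*_xM$ and $T_xM$. The Legendre transform $J^*_W(\mu,\cdot) : T^*_\mu\mathcal{P} \to T_\mu\mathcal{P}$ is defined by

\[ (x \mapsto \alpha(x)) \longmapsto J^*_W(\mu,\alpha) = \bigl(x \mapsto J^*(x,\alpha(x))\bigr). \]

Similarly to $J^*$, $J^*_W(\mu,\alpha)$ is the maximizer of the function

\[ \Phi \mapsto \langle\alpha,\Phi\rangle_\mu - \frac12 F_W(\mu,\Phi)^2 - \frac12 F^*_W(\mu,\alpha)^2 \]

and $F_W(\mu,J^*_W(\mu,\alpha)) = F^*_W(\mu,\alpha)$.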

Recall that the relative entropy $\mathrm{Ent}(\mu)$ of $\mu \in \mathcal{P}(M)$ is defined by

\[ \mathrm{Ent}(\mu) := \int_M \rho\log\rho\,dm \in (-\infty,\infty] \]

if $\mu = \rho m \in \mathcal{P}_{ac}(M)$, and by $\mathrm{Ent}(\mu) := \infty$ otherwise. According to [St2] and [LV1], we say that $(M,F,m)$ satisfies the curvature-dimension condition $\mathrm{CD}(K,\infty)$ for some $K \in \mathbb{R}$, if the relative entropy is $K$-convex in the sense that any $\mu, \nu \in \mathcal{P}(M)$ admit a minimal geodesic $(\mu_t)_{t \in [0,1]}$ from $\mu$ to $\nu$ such that

\[ \mathrm{Ent}(\mu_t) \le (1-t)\,\mathrm{Ent}(\mu) + t\,\mathrm{Ent}(\nu) - \frac{K}{2}(1-t)\,t\,d_W(\mu,\nu)^2 \]

holds for all $t \in [0,1]$. A similar, but more involved convexity property is used to define the curvature-dimension condition $\mathrm{CD}(K,N)$ for arbitrary real numbers $N > 1$.

Theorem 6.5 (iV-Ric > K equals CD(K, N), [Oh4]) For a compact, smooth Finsler space 
(M,F,m), the bound oo-Ric > K in the sense of Definition I5.il is equivalent to CD(K, 00). 
More generally, iV-Ric > K in the sense of Definition 15.11 is equivalent to CD (K,N) in the 
sense of Lott and Villani [LV2j and Sturm [St3j . 

We recall one striking application. 
Theorem 6.6 (Lichnerowicz Inequality, [Oh4]) Let $(M,F,m)$ be a compact smooth Finsler
space satisfying the bound $N$-Ric $\ge K$ for some $K>0$ and $N\in[n,\infty]$. Then for any Lipschitz
continuous function $u:M\to\mathbb{R}$ with $\int_M u\,dm=0$, we have

$$\int_M u^2\,dm \le \frac{N-1}{KN}\int_M F(\nabla u)^2\,dm.$$

In other words, with notations from 

In the case $N=\infty$, the constant on the RHS should be understood as $K$.
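
As a sanity check of the constant, one can look at the Riemannian special case of the round unit sphere. This example is not taken from the text: it assumes that the Lichnerowicz inequality is read as $\int u^2\,dm\le\frac{N-1}{KN}\int F(\nabla u)^2\,dm$, and it uses the classical facts that $\mathbb{S}^n$ with its volume measure has Ricci curvature $n-1$ and first nonzero eigenvalue $n$.

```latex
\begin{align*}
M=\mathbb{S}^n\ (n\ge 2),\quad K=n-1,\quad N=n:\qquad
\frac{N-1}{KN}=\frac{n-1}{(n-1)\,n}=\frac1n,
\qquad
\int_{\mathbb{S}^n}u^2\,dm\le\frac1n\int_{\mathbb{S}^n}|\nabla u|^2\,dm .
\end{align*}
```

Equality holds for restrictions of linear functions (first spherical harmonics), so in this case the constant cannot be improved.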

7 Heat Flow as Gradient Flow in the Wasserstein Space 

We continue our analysis of the Wasserstein space over a smooth, compact Finsler manifold.
Using the continuity equation (7.1) below, we will see that the heat flow with respect to the
reverse Finsler structure can be regarded as the gradient flow of the relative entropy. See [JKO]
for the original work on Euclidean spaces and [Oh2], [Sav] and [Vi2] for related work on various
Riemannian spaces.

We first observe that the Wasserstein distance can be interpreted as the distance associated
with the Finsler structure introduced in Definition 6.4. The next lemma is an analogue of [AGS,
Theorem 1.1.2] with a slight modification caused by the nonsymmetric distance.

Lemma 7.1 For any locally Lipschitz continuous curve $(\mu_t)_{t\in I}\subset\mathcal{P}(M)$ on an open interval
$I\subset\mathbb{R}$, the (forward) metric derivative

$$|\dot\mu_t| := \lim_{s\to t}\frac{d_W(\mu_{\min\{s,t\}},\mu_{\max\{s,t\}})}{|t-s|}$$

exists at a.e. $t\in I$. Moreover, $|\dot\mu|\in L^\infty_{loc}(I)$ and $d_W(\mu_s,\mu_t)\le\int_s^t|\dot\mu_r|\,dr$ holds for all $s,t\in I$
with $s<t$.

Proof: Take a countable dense set $\{\nu_n\}\subset\{\mu_t : t\in I\}\subset\mathcal{P}(M)$ and define the function
$d_n(t) := d_W(\nu_n,\mu_t)$. Note that $d_n$ is locally Lipschitz continuous uniformly in $n$, so that the
function $D(t) := \sup_n d_n'(t)$ is well-defined a.e. on $I$ and $D\in L^\infty_{loc}(I)$. It follows from the triangle
inequality that

$$\liminf_{s\uparrow t}\frac{d_W(\mu_s,\mu_t)}{t-s} \ge \sup_n\liminf_{s\uparrow t}\frac{d_n(t)-d_n(s)}{t-s} = D(t)$$

for a.e. $t\in I$. Moreover, we deduce from the density of $\{\nu_n\}$ that

$$d_W(\mu_s,\mu_t) = \sup_n\{d_n(t)-d_n(s)\} = \sup_n\int_s^t d_n'\,dr \le \int_s^t D\,dr.$$

Therefore we have

$$\lim_{s\uparrow t}\frac{d_W(\mu_s,\mu_t)}{t-s} = D(t)$$

for a.e. $t\in I$. We similarly obtain $\lim_{s\downarrow t} d_W(\mu_t,\mu_s)/(s-t) = D(t)$ for a.e. $t\in I$ and this
completes the proof. ■



Lemma 7.2 Let $I\subset\mathbb{R}$ be an open interval and $(\mu_t)_{t\in I}\subset\mathcal{P}(M)$ be a locally Lipschitz continuous
curve. Suppose that a Borel vector field $\Phi(t,x)\in T_xM$ on $I\times M$ with $F(\Phi)\in L^2_{loc}(I\times M,d\mu_t\,dt)$
satisfies the continuity equation

$$\partial_t\mu_t + \operatorname{div}(\Phi_t\mu_t) = 0$$

in the weak sense that

$$\int_I\int_M\{\partial_t\psi_t + D\psi_t(\Phi_t)\}\,d\mu_t\,dt = 0 \tag{7.1}$$

for all $\psi\in C_c^\infty(I\times M)$, where $\Phi_t := \Phi(t,\cdot)$ and $\psi_t := \psi(t,\cdot)$. Then we have $F_W(\mu_t,\Phi_t)\ge|\dot\mu_t|$
for a.e. $t\in I$.

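Proof: Fix $s,t\in I$ with $s<t$. We denote by $\Gamma_{[s,t]}$ the set of absolutely continuous curves
$\gamma:[s,t]\to M$ endowed with the uniform (supremum) topology, and define the evaluation map
$e_\tau:\Gamma_{[s,t]}\to M$ at $\tau\in[s,t]$ by $e_\tau(\gamma) := \gamma(\tau)$. By virtue of [AGS, Theorem 8.2.1], there exists
a probability measure $\Pi\in\mathcal{P}(\Gamma_{[s,t]})$ such that $(e_\tau)_\sharp\Pi = \mu_\tau$ for all $\tau\in[s,t]$ and that $\Pi$ is
concentrated on the set of curves $\gamma$ solving $\dot\gamma(\tau) = \Phi_\tau(\gamma(\tau))$ for a.e. $\tau\in[s,t]$. Since

$$d^2(\gamma(s),\gamma(t)) \le (t-s)\int_s^t F^2(\gamma(\tau),\dot\gamma(\tau))\,d\tau = (t-s)\int_s^t F^2\big(\gamma(\tau),\Phi_\tau(\gamma(\tau))\big)\,d\tau$$

holds for $\Pi$-a.e. $\gamma$, we see

$$d_W(\mu_s,\mu_t) \le \Big(\int_{\Gamma_{[s,t]}} d^2(\gamma(s),\gamma(t))\,\Pi(d\gamma)\Big)^{1/2} \le (t-s)^{1/2}\Big(\int_s^t F_W^2(\mu_\tau,\Phi_\tau)\,d\tau\Big)^{1/2}.$$

Hence we have $|\dot\mu_t|\le F_W(\mu_t,\Phi_t)$ for a.e. $t\in I$. ■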
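
A simple case in which the inequality of Lemma 7.2 becomes an equality is a Dirac mass moving along a curve. The sketch below is an illustration and not part of the original argument: $\gamma$ is a hypothetical $C^1$ curve, $\Phi_t$ is any Borel field extending $\dot\gamma(t)$ at the point $\gamma(t)$, and we assume that $F_W(\mu,\cdot)$ is the $L^2(\mu)$-norm of $F$ and that $d_W(\delta_x,\delta_y)=d(x,y)$.

```latex
\begin{align*}
&\mu_t := \delta_{\gamma(t)}, \qquad \Phi_t(\gamma(t)) := \dot\gamma(t)
  \qquad (\gamma : I \to M \text{ a } C^1 \text{ curve}),\\
&\int_I\int_M \{\partial_t\psi_t + D\psi_t(\Phi_t)\}\,d\mu_t\,dt
  = \int_I \frac{d}{dt}\big[\psi(t,\gamma(t))\big]\,dt = 0
  \qquad (\psi \in C_c^\infty(I\times M)),\\
&F_W(\mu_t,\Phi_t) = F\big(\gamma(t),\dot\gamma(t)\big)
  = \lim_{s\downarrow t}\frac{d(\gamma(t),\gamma(s))}{s-t}
  = \lim_{s\downarrow t}\frac{d_W(\mu_t,\mu_s)}{s-t} = |\dot\mu_t| .
\end{align*}
```

Traversing $\gamma$ in the opposite direction replaces $F(\gamma,\dot\gamma)$ by $F(\gamma,-\dot\gamma)$, which in general is a different number; this is where the nonsymmetry of the distance enters.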



Theorem 7.3 Let $I\subset\mathbb{R}$ be an open interval and $(\mu_t)_{t\in I}\subset\mathcal{P}(M)$ be a locally Lipschitz continuous
curve. Then there exists a Borel vector field $\Phi(t,x)\in T_xM$ on $I\times M$ with $F(\Phi)\in
L^2_{loc}(I\times M,d\mu_t\,dt)$ satisfying the continuity equation (7.1). Moreover, such a vector field is
unique up to a difference on a null measure set with respect to $d\mu_t\,dt$ and satisfies $F_W(\mu_t,\Phi_t)=|\dot\mu_t|$
for a.e. $t\in I$.

Proof: Without loss of generality, we assume that $I=(0,1)$ and $(\mu_t)_{t\in I}$ is Lipschitz continuous.
We consider the functional $\Psi$ on the space $V := \{D\psi = (D\psi_t)_{t\in I} : \psi\in C_c^\infty(I\times M)\}$ defined by

$$\Psi(D\psi) := -\int_I\int_M \partial_t\psi_t\,d\mu_t\,dt.$$

Clearly $\Psi$ is well-defined and linear. We equip $V$ with the norm

$$F_V^*(D\psi) := \Big(\int_I\int_M F^{*2}(x,D\psi_t(x))\,\mu_t(dx)\,dt\Big)^{1/2}.$$

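Given $D\psi\in V$, we see

$$\Psi(D\psi) = \lim_{\varepsilon\downarrow0}\int_I\int_M\frac{\psi_{t-\varepsilon}-\psi_t}{\varepsilon}\,d\mu_t\,dt
 = \lim_{\varepsilon\downarrow0}\int_I\frac1\varepsilon\Big\{\int_M\psi_t\,d\mu_{t+\varepsilon}-\int_M\psi_t\,d\mu_t\Big\}\,dt.$$

Denote by $\pi_{t,t+\varepsilon}$ the optimal coupling of $(\mu_t,\mu_{t+\varepsilon})$. Taking

$$\psi_t(y)-\psi_t(x) \le F^*(x,D\psi_t(x))\,d(x,y) + o(d(x,y))$$

into account, we deduce that

$$\begin{aligned}
\Psi(D\psi) &= \liminf_{\varepsilon\downarrow0}\int_I\frac1\varepsilon\Big\{\int_{M\times M}\{\psi_t(y)-\psi_t(x)\}\,\pi_{t,t+\varepsilon}(dx\,dy)\Big\}\,dt\\
&\le \Big(\int_I\int_M F^{*2}(x,D\psi_t(x))\,\mu_t(dx)\,dt\Big)^{1/2}\liminf_{\varepsilon\downarrow0}\Big(\int_I\frac{d_W(\mu_t,\mu_{t+\varepsilon})^2}{\varepsilon^2}\,dt\Big)^{1/2}\\
&= F_V^*(D\psi)\Big(\int_I|\dot\mu_t|^2\,dt\Big)^{1/2}.
\end{aligned}$$

We similarly obtain $\Psi(D\psi)\ge -F_V^*(D(-\psi))\big(\int_I|\dot\mu_t|^2\,dt\big)^{1/2}$. Hence $\Psi$ is a bounded functional
and extends to the closure $\overline{V}$ of $V$ with respect to $F_V^*$.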

Thus we find a unique $\alpha\in\overline{V}$ (up to a difference on a null measure set) maximizing the functional
$\Psi - F_V^{*2}/2$ on $\overline{V}$. We set $\Phi_t := J_W^*(\mu_t,\alpha_t)$ and observe by construction that, for any $\psi\in
C_c^\infty(I\times M)$,

$$\int_I\int_M\{\partial_t\psi_t + D\psi_t(\Phi_t)\}\,d\mu_t\,dt = 0.$$

This is nothing but the desired continuity equation (7.1). Strict convexity of the norm squared
$F_V^{*2}/2$ ensures that $\alpha$ is actually the unique element satisfying (7.1).
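Take a sequence $\{D\psi^{(n)}\}_{n\in\mathbb{N}}\subset V$ converging to $\alpha$. Then we find

$$\begin{aligned}
F_V^{*2}(\alpha) &= \lim_{n\to\infty}\int_I\int_M D\psi_t^{(n)}(\Phi_t)\,d\mu_t\,dt = \lim_{n\to\infty}\Psi(D\psi^{(n)})\\
&\le \lim_{n\to\infty}F_V^*(D\psi^{(n)})\Big(\int_I|\dot\mu_t|^2\,dt\Big)^{1/2} = F_V^*(\alpha)\Big(\int_I|\dot\mu_t|^2\,dt\Big)^{1/2}.
\end{aligned}$$

Combining this with Lemma 7.2 shows $F_W(\mu_t,\Phi_t)=|\dot\mu_t|$ for a.e. $t\in I$. ■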



Definition 7.4 For each locally Lipschitz continuous curve $(\mu_t)_{t\in I}\subset\mathcal{P}(M)$, we denote by
$\dot\mu_t\in T_{\mu_t}\mathcal{P}$ its tangent vector field given by Theorem 7.3.

Corollary 7.5 For any $\mu,\nu\in\mathcal{P}(M)$, we have

$$d_W(\mu,\nu) = \inf_{(\mu_t)_{t\in[0,1]}}\Big(\int_0^1 F_W(\mu_t,\dot\mu_t)^2\,dt\Big)^{1/2},$$

where the infimum is taken over all locally Lipschitz continuous curves $(\mu_t)_{t\in[0,1]}\subset\mathcal{P}(M)$ with
$\mu_0=\mu$ and $\mu_1=\nu$.

Proof: Recall that $F_W(\mu_t,\dot\mu_t)=|\dot\mu_t|$ a.e. by Theorem 7.3. Then the inequality $\le$ follows from
Lemma 7.1 and equality is attained by a minimal geodesic from $\mu$ to $\nu$. ■



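For $\mu\in\mathcal{P}(M)$, we define the exponential map $\exp_\mu : T_\mu\mathcal{P}\to\mathcal{P}(M)$ by $\exp_\mu(\Phi) := (\exp\Phi)_\sharp\mu$.
Given a function $S$ on (a subset of) $\mathcal{P}(M)$, we say that $S$ is differentiable at $\mu\in\mathcal{P}(M)$ in
direction $\Phi\in T_\mu\mathcal{P}$ if the directional derivative

$$D_\Phi S(\mu) := \lim_{t\downarrow0}\frac{S(\exp_\mu(t\Phi))-S(\mu)}{t}$$

exists. We say that $S$ is differentiable at $\mu\in\mathcal{P}(M)$ if there exists $\alpha\in T_\mu^*\mathcal{P}$ such that $\langle\alpha,\Phi\rangle_\mu =
D_\Phi S(\mu)$ holds for all $\Phi=\nabla\varphi\in T_\mu\mathcal{P}$ with $\varphi\in C^\infty(M)$. In this case, this $\alpha$ is denoted
by $DS(\mu)$ and called the derivative of $S$ at $\mu$. The gradient vector of $S$ at $\mu$ is defined by
$\nabla_W S(\mu) := J_W^*(\mu,DS(\mu))$.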

Definition 7.6 A continuous curve $(\mu_t)_{t\ge0}\subset\mathcal{P}(M)$ which is locally Lipschitz continuous on
$(0,\infty)$ is called a gradient flow for $S$ if $\dot\mu_t=\nabla_W(-S)(\mu_t)$ holds at a.e. $t\in(0,\infty)$.

Proposition 7.7 Take $\mu=\rho m\in\mathcal{P}_{ac}(M)$ such that $\rho\in H^1(M)$. If $-\log\rho\notin H^1(M,\mu)$, then
$-\mathrm{Ent}$ is not differentiable at $\mu$. If $-\log\rho\in H^1(M,\mu)$, then $-\mathrm{Ent}$ is differentiable at $\mu$ and the
gradient vector is given by

$$\nabla_W(-\mathrm{Ent})(\mu) = \frac1\rho\nabla(-\rho)\in T_\mu\mathcal{P}.$$

In particular, its norm squared $F_W^2(\mu,\nabla_W(-\mathrm{Ent})(\mu))$ coincides with the Fisher information
with respect to the reverse Finsler structure $\overleftarrow{F}$:

$$I(\rho) := \int_M \overleftarrow{F}^2\big(x,\overleftarrow{\nabla}\rho(x)\big)\,\frac{1}{\rho(x)}\,m(dx) = \int_M F^2\big(x,\nabla(-\rho)(x)\big)\,\frac{1}{\rho(x)}\,m(dx).$$

Proof: Fix arbitrary $\varphi\in C^\infty(M)$ and put $\Phi := \nabla\varphi$, $U_0 := \{x\in M : D\varphi(x)=0\}$. By
virtue of Lemma 6.3, the function $t\varphi$ is $d^2/2$-convex for sufficiently small $t>0$. Hence the
map $T_t(x) := \exp_x(t\Phi(x))$ is the unique optimal transport from $\mu$ to $\mu_t := (T_t)_\sharp\mu$. We will
use some properties of $T_t$ and $\mu_t$ established in [Oh4]. The map $T_t$ is injective on a subset
of $\mu$-full measure and $\mu_t$ is absolutely continuous, so that we can write $\mu_t=\rho_t m$. The map
$T_t$ is $C^\infty$ on $M\setminus U_0$ as $T_t(x)$ is not a cut point of $x$. For $\mu$-a.e. $x\in M$, we have the Jacobian
equation $\rho(x) = \rho_t(T_t(x))\,\mathbf{D}[DT_t(x)]$. Here $\mathbf{D}[DT_t(x)]$ denotes the Jacobian of the linear operator
$DT_t(x) : T_xM\to T_{T_t(x)}M$ with respect to $m$. That is to say, $\mathbf{D}[DT_t(x)] := 1$ if $x\in U_0$, and

$$\mathbf{D}[DT_t(x)] = \frac{m_{T_t(x)}(DT_t(A))}{m_x(A)}$$

for $x\in M\setminus U_0$, where $A\subset T_xM$ is an arbitrary nonempty, bounded open set.

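The change of variable formula and the Jacobian equation $\rho=\rho_t(T_t)\,\mathbf{D}[DT_t]$ show that

$$\begin{aligned}
\mathrm{Ent}(\mu_t) &= \int_M\rho_t\log\rho_t\,dm = \int_M\rho_t(T_t)\log\big(\rho_t(T_t)\big)\,\mathbf{D}[DT_t]\,dm\\
&= \int_M\rho\log\Big(\frac{\rho}{\mathbf{D}[DT_t]}\Big)\,dm = \mathrm{Ent}(\mu)-\int_M\log\big(\mathbf{D}[DT_t]\big)\,\rho\,dm.
\end{aligned}$$

Thus we have

$$D_\Phi(-\mathrm{Ent})(\mu) = \lim_{t\downarrow0}\int_M\frac{\rho}{t}\log\mathbf{D}[DT_t]\,dm = -\int_M D\rho(\Phi)\,dm. \tag{7.2}$$

If $-\log\rho\in H^1(M,\mu)$, then we obtain $D(-\mathrm{Ent})(\mu) = -(D\rho)/\rho = D(-\log\rho)$ and

$$\nabla_W(-\mathrm{Ent})(\mu) = \nabla(-\log\rho) = \frac1\rho\nabla(-\rho).$$

In the other case where $-\log\rho\notin H^1(M,\mu)$, we approximate $\rho$ by smooth positive $\sigma$ and consider
$\varphi=-\log\sigma$. Then the above calculation (7.2) leads to

$$\limsup_{\nu\to\mu}\frac{\mathrm{Ent}(\mu)-\mathrm{Ent}(\nu)}{d_W(\mu,\nu)} = \infty.$$

Hence $-\mathrm{Ent}$ is not differentiable at $\mu$. ■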
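
The passage from the logarithm of the Jacobian to formula (7.2) rests on a first-order expansion followed by an integration by parts. The display below is a formal sketch of that step only: the symbol $\operatorname{div}_m$ is notation introduced here for the divergence with respect to $m$ (characterized by $\int_M f\operatorname{div}_m\Phi\,dm=-\int_M Df(\Phi)\,dm$), and regularity questions, in particular on the set $U_0$, are ignored.

```latex
\begin{align*}
\frac{d}{dt}\Big|_{t=0}\log\mathbf{D}[DT_t(x)] &= \operatorname{div}_m\Phi(x),\\
D_\Phi(-\mathrm{Ent})(\mu)
  &= \int_M \operatorname{div}_m\Phi\,\rho\,dm
   = -\int_M D\rho(\Phi)\,dm
   = \int_M \Big\langle -\frac{D\rho}{\rho},\Phi\Big\rangle\,\rho\,dm
   = \Big\langle -\frac{D\rho}{\rho},\Phi\Big\rangle_\mu .
\end{align*}
```

Read this way, the derivative $D(-\mathrm{Ent})(\mu)=-(D\rho)/\rho$ is exactly the covector that represents the directional derivatives through the pairing $\langle\cdot,\cdot\rangle_\mu$.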



Theorem 7.8 Let $(\mu_t)_{t\ge0}\subset\mathcal{P}_{ac}(M)$ be a continuous curve which is locally Lipschitz continuous
on $(0,\infty)$, and assume that $\mu_t=\rho_t m$ with $\rho_t\in H^1(M)$ for a.e. $t\in(0,\infty)$. Then $(\mu_t)_{t\ge0}$ is a gradient
flow for the relative entropy if and only if $(\rho_t)_{t\ge0}$ is a heat flow with respect to the reverse Finsler
structure $\overleftarrow{F}$ of $F$.

Proof: If $(\mu_t)_{t\ge0}$ is a gradient flow, then Proposition 7.7 yields that

$$\dot\mu_t = \nabla_W(-\mathrm{Ent})(\mu_t) = \frac{\nabla(-\rho_t)}{\rho_t}$$

for a.e. $t\in(0,\infty)$. Then it follows from the continuity equation (7.1) that, for any test function
$\psi\in C_c^\infty((0,\infty)\times M)$,

$$-\int_0^\infty\int_M\psi_t\,\partial_t\rho_t\,dm\,dt = \int_0^\infty\int_M\partial_t\psi_t\,\rho_t\,dm\,dt = -\int_0^\infty\int_M D\psi_t(\nabla(-\rho_t))\,dm\,dt.$$

Therefore $-\rho_t$ is a heat flow with respect to $F$ or, equivalently, $\rho_t$ is a heat flow with respect to
$\overleftarrow{F}$.

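Conversely, if $(\rho_t)_{t\ge0}$ is a heat flow with respect to $\overleftarrow{F}$, then a similar calculation shows that
$\{\nabla(-\rho_t)\}/\rho_t$ satisfies the continuity equation (7.1). We remark that, given $0<t_0<t_1<\infty$,
approximating $\rho$ with smooth positive $\sigma\in C^\infty([t_0,t_1]\times M)$ and considering $\psi=-\log\sigma$ yields

$$\begin{aligned}
\int_{t_0}^{t_1}\int_M\frac{F^2(\nabla(-\rho_t))}{\rho_t}\,dm\,dt
&= \int_{t_0}^{t_1}\int_M\partial_t\rho_t\,dm\,dt - \int_M\rho_{t_1}\log\rho_{t_1}\,dm + \int_M\rho_{t_0}\log\rho_{t_0}\,dm\\
&= \mathrm{Ent}(\mu_{t_0}) - \mathrm{Ent}(\mu_{t_1}) < \infty.
\end{aligned}$$

Therefore $-\log\rho_t\in H^1(M,\mu_t)$ and $\dot\mu_t = \{\nabla(-\rho_t)\}/\rho_t = \nabla_W(-\mathrm{Ent})(\mu_t)$ for a.e. $t\in(0,\infty)$ by
Theorem 7.3 and Proposition 7.7. ■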
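
In the simplest reversible situation the statement reduces to the familiar picture of [JKO]. The following worked case is an illustration under assumptions that are not in the text: $M$ is the circle $\mathbb{R}/\mathbb{Z}$ with $F(x,v)=|v|$ and Lebesgue measure, and $\rho_t$ is smooth and positive, so that $F$ and its reverse coincide.

```latex
\begin{align*}
&M=\mathbb{R}/\mathbb{Z},\quad F(x,v)=|v|,\quad m=dx,\quad \mu_t=\rho_t\,dx:\\
&\nabla_W(-\mathrm{Ent})(\mu_t)=\frac{1}{\rho_t}\nabla(-\rho_t)=-\frac{\rho_t'}{\rho_t},\\
&\partial_t\rho_t+\partial_x\Big(\rho_t\cdot\Big(-\frac{\rho_t'}{\rho_t}\Big)\Big)
  =\partial_t\rho_t-\rho_t''=0,\\
&\frac{d}{dt}\mathrm{Ent}(\mu_t)=\int(\log\rho_t+1)\,\rho_t''\,dx
  =-\int\frac{(\rho_t')^2}{\rho_t}\,dx .
\end{align*}
```

The continuity equation with velocity field $-\rho_t'/\rho_t$ is thus the linear heat equation, and the entropy decays at the rate given by the Fisher information.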



Corollary 7.9 Under the same assumptions as in Theorem 7.8, the following are equivalent:

(i) $(\mu_t)_{t\ge0}$ is a gradient flow for the relative entropy on the reverse Wasserstein space (i.e., the
space of probability measures with the reverse Wasserstein distance);

(ii) $(\mu_t)_{t\ge0}$ solves the ODE $\dot\mu_t = -\nabla_W\mathrm{Ent}(\mu_t)$ on the Wasserstein space;

(iii) $(\rho_t)_{t\ge0}$ solves the heat equation on $M$.

Remark 7.10 (1) What is missing in Theorem 7.8 is the contraction property of the heat flow
in the Wasserstein space which is well-known in the Riemannian setting (see, e.g., [vRS] and
[Oh2]). Compare this with Corollary 3.6. As mentioned in [AGS, page 4], even the contraction
of gradient flows of ($K$-)convex functions on Banach spaces is still an open problem.

(2) In Theorem 7.8, the existence of the heat flow starting from given $\rho_0$ is guaranteed by
Theorem 3.4. On the other hand, as the relative entropy is $K$-convex if $(M,F,m)$ satisfies the
bound $\infty$-Ric $\ge K$ (Theorem 6.5), we can argue as in [AGS, §2] or [Oh2, §5] (except right
differentiability for which we need tangent cones) to obtain a continuous curve $(\mu_t)_{t\ge0}\subset\mathcal{P}(M)$
which satisfies the following properties:

(i) The curve $t\mapsto\mu_t$ is locally Lipschitz continuous on $(0,\infty)$.

(ii) For all $t>0$, we have

$$\lim_{\delta\downarrow0}\frac{d_W(\mu_t,\mu_{t+\delta})}{\delta} = F_W\big(\mu_t,\nabla_W(-\mathrm{Ent})(\mu_t)\big);$$

$$\mathrm{Ent}(\mu_t) = \mathrm{Ent}(\mu_0) - \int_0^t F_W\big(\mu_s,\nabla_W(-\mathrm{Ent})(\mu_s)\big)^2\,ds.$$

Thus, in a certain sense $(\mu_t)_{t\ge0}$ will be a gradient flow for the entropy. However, it is unclear
whether it is actually a gradient flow in the sense of Definition 7.6.
8 Appendix

8.1 Proof of $\partial_t u\in H_0^1$ for global solutions on compact $M$

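Let $u$ be a global solution of the heat equation on a compact space $M$ with $u_0\in H_0^1(M)$. We
know from Theorem 3.4 that $v(t,x) = \partial_tu(t,x)$ exists for a.e. $(t,x)$ and satisfies

$$\int_0^T\int_M v^2\,dm\,dt \le \mathcal{E}(u_0).$$

For arbitrary $\delta\in\mathbb{R}$ put $v^{(\delta)}(t,x) = (u(t+\delta,x)-u(t,x))/\delta$. Then it follows from (3.10) that

$$\partial_t\|v_t^{(\delta)}\|_{L^2}^2 \le -4\kappa_M\,\mathcal{E}(v_t^{(\delta)}).$$

Hence, for $|\delta|<\tau<T$,

$$\begin{aligned}
4\tau\kappa_M\int_\tau^T\mathcal{E}(v_t^{(\delta)})\,dt
&\le 4\kappa_M\int_0^\tau\int_s^T\mathcal{E}(v_t^{(\delta)})\,dt\,ds \le \int_0^\tau\|v_s^{(\delta)}\|_{L^2}^2\,ds\\
&= \int_M\int_0^\tau\Big(\frac1\delta\int_s^{s+\delta}v_t\,dt\Big)^2ds\,dm \le \int_M\int_0^{\tau+\delta}v_t^2\,dt\,dm \le \mathcal{E}(u_0).
\end{aligned}$$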

Therefore, the family $\{v^{(\delta)} : |\delta|<\tau\}$ is bounded in the norm

$$\Big(\int_\tau^T\big[\mathcal{E}(w_t)+\|w_t\|_{L^2}^2\big]\,dt\Big)^{1/2}$$

of $L^2([\tau,T],H_0^1(M))$ for any $\tau>0$. Reflexivity and completeness of $H_0^1(M)$ then imply the
existence of $\tilde v\in L^2([\tau,T],H_0^1(M))$ such that $v^{(\delta)}\to\tilde v$ in the given norm. This in particular
implies convergence in $L^2$ and thus $\tilde v=v$. Therefore, $v_t\in H_0^1(M)$ for a.e. $t$ with locally square
integrable norm of the derivative $F^*(Dv_t)$. Note that we used the compactness of $M$ only for
ensuring $\kappa_M>0$.
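
The step that bounds the difference quotients by the time derivative is an application of Jensen's inequality followed by Fubini's theorem. The sketch below spells it out pointwise in $x$ for $0<\delta$, assuming the representation $v_s^{(\delta)}=\frac1\delta\int_s^{s+\delta}v_t\,dt$; the case $\delta<0$ is analogous with the window $[s+\delta,s]$.

```latex
\begin{align*}
\int_0^\tau\Big(\frac1\delta\int_s^{s+\delta}v_t\,dt\Big)^2ds
 &\le \int_0^\tau\frac1\delta\int_s^{s+\delta}v_t^2\,dt\,ds
   && \text{(Jensen)}\\
 &= \int_0^{\tau+\delta}v_t^2\,\frac{\big|[0,\tau]\cap[t-\delta,t]\big|}{\delta}\,dt
   && \text{(Fubini)}\\
 &\le \int_0^{\tau+\delta}v_t^2\,dt .
\end{align*}
```

Integrating over $M$ then gives a bound by the space-time $L^2$-norm of $v=\partial_tu$, which is controlled by the initial energy.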

8.2 The same for local solutions on arbitrary $M$

For local solutions, essentially the same arguments apply. For each open set $\Omega_0$ relatively
compact in $\Omega$ we choose another relatively compact open set $\Omega_1$ containing the closure of $\Omega_0$
and a (cut-off) function $\psi\in H_0^1(\Omega_1)$ satisfying $0\le\psi\le1$ and $F^*(D\psi)\le C$ on $M$ (for some
constant $C$) and $\psi=1$ on $\Omega_0$. For instance, we can choose $\psi(x) = d(M\setminus\Omega_1,\Omega_0\cup\{x\})/d(M\setminus\Omega_1,\Omega_0)$.
Then a modification of the above calculations yields, with k = k^ and k = (Af^A^ ) _1 

- ~fls I ^{vf ] fdm = i / (^(V'nt+j) - D(^u t ))(Vu t+s - Vu t )dm 

1 YJsii J ./fii 

= 7? / ^{Du-t+s ~ Du t ){Vu t+ s - Vut) dm 

2 Jq 1 

+ -pr / (u t+5 - Ut^ipDipCVut+s - Vu t ) dm 
o 2 Jnx 

i 12 I ^ 2 F* 2 (Du t+5 - Du t ) dm 
° 2 Jdi 

ut+s ~ *i*||L 2 (ni) • i^J^ ip 2 F* 2 (^(Du t+S - Du t )j drn^j 



S 

^ K f Z7*2. n (<5K , 2C 2 K 2 (5) 2 

-2/ ^ < • ^ k~ "* "^(nO- 
For the inequality (*) we use in addition to the previous argument the fact that

$$F\big(J^*(\alpha)-J^*(\beta)\big) \le \tilde\kappa\,F^*(\alpha-\beta)$$

which follows from our basic assumption (1.2) on $F$ since for some intermediate point $\gamma\in T^*M$
between $\alpha$ and $\beta$ we have

[J* (a) - J*(/3)] • J[J*(a) - J*(/?)] = g* (7) (a - 0) ■ J[J» - J*(/?)] 

which implies 

F 2 {j*(a)- J*(/3)) <F 2 {g*( 7 ).(a-P)) 

<-L(a-f3) T -g*( 7 ) T -g*( 7 )-(a-P) 

<3-^l«-^| 2 <T2V^ 2 (a-/?)- 

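Hence, for $|\delta|<\tau<T$,

$$c\int_\tau^T\mathcal{E}_{\Omega_0}(v_t^{(\delta)})\,dt \le \int_0^T\|v_t^{(\delta)}\|_{L^2(\Omega_1)}^2\,dt \le \int_0^{T+\tau}\|v_t\|_{L^2(\Omega_1)}^2\,dt \le \mathcal{E}_{\Omega_1}(u_0)$$

with $c = \kappa\tau(1+4C^2\tilde\kappa^2T/\kappa)^{-1}$. The same argumentation as before now implies that $v_t\in H^1_{loc}(\Omega)$
for a.e. $t$ provided $u_0\in H^1_{loc}(\Omega)$.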

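8.3 Proof of $u\in H^2_{loc}$ for local solutions

Let $\Omega_1\subset\Omega$ be an open subset on which a global coordinate system is given. Fix $k\in\{1,\dots,n\}$
and put $x^\delta = x+\delta e_k$ for small $\delta\in\mathbb{R}$ as well as $D_k^\delta\varphi(x) = (\varphi(x^\delta)-\varphi(x))/\delta$. Observe that

$$D_k^\delta(\varphi\psi)(x) = \varphi(x^\delta)\,D_k^\delta\psi(x) + \psi(x)\,D_k^\delta\varphi(x)$$

for all compactly supported $\varphi$ and $\psi$ on $\Omega_1$. If $u$ is a solution to the heat equation then for every
test function $\varphi$ which is compactly supported in $\Omega_1$

$$\begin{aligned}
-\int(D_k^{-\delta}\varphi)(\partial_tu)\,e^{-V}\,m(dx)
&= \int[D(D_k^{-\delta}\varphi)\cdot J^*(Du)]\,e^{-V}\,m(dx) = -\int D\varphi\cdot D_k^\delta\big(J^*(Du)e^{-V}\big)\,m(dx)\\
&= -\int[D\varphi(x)\cdot D_k^\delta(J^*(Du))(x)]\,e^{-V(x^\delta)}\,m(dx)\\
&\quad - \int[D\varphi(x)\cdot J^*(Du)(x)]\,D_k^\delta(e^{-V(x)})\,m(dx).
\end{aligned}$$

On the other hand,

$$\begin{aligned}
-\int(D_k^{-\delta}\varphi)(\partial_tu)\,e^{-V}\,m(dx) &= \int\varphi\,D_k^\delta(\partial_tu\cdot e^{-V})\,m(dx)\\
&= \int\varphi(x)\,\partial_t(D_k^\delta u)(x)\,e^{-V(x)}\,m(dx) + \int\varphi(x)\,\partial_tu(x^\delta)\,D_k^\delta(e^{-V(x)})\,m(dx).
\end{aligned}$$

That is,

$$\begin{aligned}
-\int\varphi\,\partial_t(D_k^\delta u)\,dm &= \int[D\varphi(x)\cdot D_k^\delta(J^*(Du))(x)]\,e^{V(x)-V(x^\delta)}\,m(dx)\\
&\quad + \int[D\varphi(x)\cdot J^*(Du)(x)]\,e^{V(x)}D_k^\delta(e^{-V(x)})\,m(dx)\\
&\quad + \int\varphi(x)\,\partial_tu(x^\delta)\,e^{V(x)}D_k^\delta(e^{-V(x)})\,m(dx).
\end{aligned} \tag{8.1}$$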



To simplify the presentation, let us first of all treat the particular case where $u$ is a global
solution on $\Omega_1$, i.e. $u\in H_0^1(\Omega_1)$. This allows us to choose $\varphi = D_k^\delta u$, which then yields

$$\begin{aligned}
-\frac12\partial_t\int|D_k^\delta u(x)|^2\,m(dx) &= \int[D_k^\delta(Du)(x)\cdot D_k^\delta(J^*(Du))(x)]\,e^{V(x)-V(x^\delta)}\,m(dx)\\
&\quad + \int[D_k^\delta(Du)(x)\cdot J^*(Du)(x)]\,e^{V(x)}D_k^\delta(e^{-V(x)})\,m(dx)\\
&\quad + \int D_k^\delta u(x)\,\partial_tu(x^\delta)\,e^{V(x)}D_k^\delta(e^{-V(x)})\,m(dx).
\end{aligned}$$



We will estimate each of the three terms on the right-hand side from below (or in modulus). Using
the bound $|D_kV|\le A$ from our assumption on $V$ we obtain $|e^VD_k^\delta(e^{-V})|\le(e^{\delta A}-1)/\delta\le2A$ for
all sufficiently small $\delta$ and thus we can estimate the second term as follows

$$\Big|\int[D_k^\delta(Du)\cdot J^*(Du)]\,e^VD_k^\delta(e^{-V})\,dm\Big|
\le 2A\Big(\int F^{*2}(D_k^\delta Du)\,dm\Big)^{1/2}\Big(\int F^{*2}(Du)\,dm\Big)^{1/2}
= 4A\,\mathcal{E}_{\Omega_1}(D_k^\delta u)^{1/2}\,\mathcal{E}_{\Omega_1}(u)^{1/2}.$$

The third term can be estimated as

$$\Big|\int D_k^\delta u(x)\,\partial_tu(x^\delta)\,e^{V(x)}D_k^\delta(e^{-V(x)})\,m(dx)\Big|
\le 2Ae^{\delta A/2}\,\|D_k^\delta u\|_{L^2(\Omega_1)}\,\|\partial_tu\|_{L^2(\Omega_1)}.$$

Finally, using the bound 

F(x,J*(x,a) - J*{x s ,a)) < -j=\J*{x,a) - J*(x 5 ,a)\ < ^=^= 



a 



5A 



< -^F*(x d ,a) =: 6A'F*(x s ,a) 



for all a from assumptions (jl.3p . (|4.4p as well as the basic convexity assumption (jl.5p of the 
norm F* with k := kq 15 the first term (times e 5A ) yields 

e 5A [ [D{{Du){x)-D{{r{Du)){x)]e v{x) - v{ - xS) m{dx) 



> 



I [Du(x s ) - Du(x)} ■ [j*(x S ,Du(x 5 )) - J*(x,Du(x))] m(dx) 



> ^ / [Du(x s ) - Du(x)] • [J* (x, Du{x 5 )) - J* (x, Du(x))] m(dx) 



T 



F*(x,Du{x 5 ) - Du(x))F*(x 5 ,Du(x s )) m{dx) 
>n J F* 2 (x,D 5 k u(x))m(dx) 

-A' J F*(x,D s k u(x))F*(x s ,Du(x s )) m(dx) 
> 2 K £ ni (D 5 k u) - 2A'e 5k / 2 £n 1 {Dlu) 1 l 2 £nM 112 - 
Summarizing and integrating with respect to $t\in[0,T]$, we obtain

dt 



l\\D 5 k u \\ 2 LHQi) > -\ I <h 



T 

D d k u t (x)\' 2 m{dx) 



o 

f T f T f T 

Jo Jo Jo 

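We know that $\int_0^T\|\partial_tu_t\|_{L^2(\Omega_1)}^2\,dt = \mathcal{E}_{\Omega_1}(u_0)-\mathcal{E}_{\Omega_1}(u_T)\le\mathcal{E}_{\Omega_1}(u_0)$ and $\mathcal{E}_{\Omega_1}(u_t)\le\mathcal{E}_{\Omega_1}(u_0)$ for
every $t>0$. Moreover,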



S 1 1 2 



6 



D k u t (x + te k )dt 



o 



2 



-V(x) 



m(dx) 



< 



\D k u t (x)\ 2 e- v ^m(dx) < C£ Ql (u t ). 



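Hence,

$$\int_0^T\mathcal{E}_{\Omega_1}(D_k^\delta u_t)\,dt \le C'\,\mathcal{E}_{\Omega_1}(u_0) < \infty$$

uniformly in $\delta$ (provided $|\delta|$ is sufficiently small). Thus $w_t = D_ku_t = \lim_{\delta\to0}D_k^\delta u_t$ exists in
$H_0^1(\Omega_1)$ for a.e. $t$ and satisfies $\int_0^T\mathcal{E}_{\Omega_1}(w_t)\,dt<\infty$.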

In order to treat the general case, let us now merely assume that $u$ is a local solution. Given any
point in $M$, we find a neighborhood $\Omega_0$ and another relatively compact open set $\Omega_1$ containing
the closure of $\Omega_0$ and admitting a global coordinate system. We choose a (cut-off) function
$\psi\in H_0^1(\Omega_1)$ satisfying $0\le\psi\le1$ and $F^*(D\psi)\le C$ on $M$ (for some constant $C$) and $\psi=1$ on
$\Omega_0$.

Now let us put $\varphi = \psi^2D_k^\delta u$ in (8.1). Then all the integrals $\int\cdots\,m(dx)$ in the previous calculations
have to be changed into $\int\cdots\,\psi^2(x)\,m(dx)$. In particular, the leading order term will then
be of the form

$$\kappa\int F^{*2}\big(x,D_k^\delta u(x)\big)\,\psi^2(x)\,m(dx) \ge 2\kappa\,\mathcal{E}_{\Omega_0}(D_k^\delta u).$$

Moreover, due to the Leibniz rule, two additional terms will show up (from differentiating the first
factor in $\varphi = \psi^2D_k^\delta u$ with respect to $D$). However, these terms can easily be estimated in terms
of the above 'leading order term', $\mathcal{E}_{\Omega_1}(u)$ and $\|D_k^\delta u\|_{L^2(\Omega_1)}$, cf. estimate (*) in the previous
section. It finally implies

$$\int_0^T\mathcal{E}_{\Omega_0}(D_k^\delta u_t)\,dt \le C\,\mathcal{E}_{\Omega_1}(u_0) < \infty$$

uniformly in $\delta$ and thus $D_ku_t\in H^1_{loc}(\Omega)$ for a.e. $t$.